ChatGPT and Elasticsearch: The RAG Really Tied the App Together

This notebook will show you how to:

  • Create an Elastic Serverless project
  • Set up an inference API endpoint, which downloads and deploys ELSER for embedding inference
  • Create an index template that uses semantic_text to auto-chunk and embed the body of text
  • Use the Elastic Open Crawler to crawl the Elastic Search, Observability, and Security Labs sites

The accompanying blog takes it further by showing you how to:

  • Use Playground to test chat prompts and configurations, then generate queries for our RAG app
  • Use the queries from Playground to finish out a RAG chatbot app with a Python FastAPI backend and a React frontend

Project Setup

Enter your Cloud API Key

Generate your secret API key at https://cloud.elastic.co/account/keys

[3]
Enter your API key: ··········
API key successfully entered!
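
The code for this cell is not shown in this export. A minimal sketch of the prompt might look like the following, assuming the key may also be supplied through an environment variable. The function name and the ELASTIC_CLOUD_API_KEY variable name are hypothetical and do not come from the notebook.

```python
import os
from getpass import getpass


def read_cloud_api_key(env_var: str = "ELASTIC_CLOUD_API_KEY") -> str:
    """Return the Cloud API key from the environment, or prompt for it."""
    # env_var is a hypothetical name; the notebook itself only prompts.
    key = os.environ.get(env_var, "").strip()
    if not key:
        # getpass hides the typed key, matching the masked prompt above.
        key = getpass("Enter your API key: ").strip()
    if not key:
        raise ValueError("No API key was entered")
    return key
```

Reading the key with getpass keeps it out of the notebook's saved output.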

Create Elasticsearch project

Serverless API Docs

[4]
Waiting for project to be ready. Current status:initializing - Loop 7 Sleeping 10 seconds
Project is ready
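
The wait loop behind the output above can be sketched as a small polling helper. This is an illustration under assumptions, not the notebook's code: get_status stands in for whatever request returns the project's current status, and the "initialized" ready value is assumed (only "initializing" appears in the output).

```python
import time
from typing import Callable


def wait_until_ready(
    get_status: Callable[[], str],
    ready_status: str = "initialized",  # assumed name of the ready state
    sleep_seconds: float = 10,
    max_loops: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll get_status() until it returns ready_status; return the loop count."""
    for loop in range(1, max_loops + 1):
        status = get_status()
        if status == ready_status:
            print("Project is ready")
            return loop
        print(
            f"Waiting for project to be ready. Current status:{status} "
            f"- Loop {loop} Sleeping {sleep_seconds} seconds"
        )
        sleep(sleep_seconds)
    # Give up instead of looping forever if the project never becomes ready.
    raise TimeoutError(f"Project not ready after {max_loops} checks")
```

Passing the status lookup and the sleep function as arguments keeps the loop easy to exercise without touching the network.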

Create Elasticsearch client

[5]

Project API Key

Create a Project level API key

[6]
full_access_key has been created

Inference API and Index Setup

Inference API

This will:

  • Create an inference API endpoint
  • Download ELSER model (if not already downloaded)
  • Deploy ELSER model with service_settings configs

Note: this step waits until ELSER has been downloaded and deployed.

[7]
Waiting for inference model to be fully deployed
Inference API created and Inference model is fully deployed.
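
As a rough illustration of the three steps above, the request that creates an ELSER inference endpoint can be described as a plain dictionary. The endpoint ID and the allocation and thread counts below are placeholder assumptions, not the notebook's values.

```python
def elser_endpoint_request(
    inference_id: str = "my-elser-endpoint",  # hypothetical endpoint ID
    num_allocations: int = 1,                 # placeholder sizing values
    num_threads: int = 1,
) -> dict:
    """Describe the request that creates a sparse_embedding ELSER endpoint."""
    return {
        "method": "PUT",
        "path": f"/_inference/sparse_embedding/{inference_id}",
        "body": {
            # Creating the endpoint triggers the ELSER download and
            # deployment if the model is not already present.
            "service": "elser",
            "service_settings": {
                "num_allocations": num_allocations,
                "num_threads": num_threads,
            },
        },
    }
```

The service_settings block is where the deployment size is configured; the endpoint ID is what the semantic_text field later refers to.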

Create index template

The two key fields here are:

  • body: the field holding the body of text, which we use as the source to copy into our semantic_text field, semantic_body
  • semantic_body: this field automatically handles chunking and generating embeddings
[8]
{'acknowledged': True}
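
A template body along these lines would wire the two fields together. This is a sketch under assumptions: the index pattern and inference endpoint ID are placeholders, and the notebook's actual template may define more fields.

```python
def build_index_template(
    index_pattern: str = "elastic-labs*",     # hypothetical index pattern
    inference_id: str = "my-elser-endpoint",  # hypothetical endpoint ID
) -> dict:
    """Build an index template body with body copied into semantic_body."""
    return {
        "index_patterns": [index_pattern],
        "template": {
            "mappings": {
                "properties": {
                    # Raw crawled text; copy_to feeds the semantic field.
                    "body": {"type": "text", "copy_to": "semantic_body"},
                    # semantic_text chunks the text and generates
                    # embeddings through the referenced inference endpoint.
                    "semantic_body": {
                        "type": "semantic_text",
                        "inference_id": inference_id,
                    },
                }
            }
        },
    }
```

Because the copy happens at index time, the crawler only needs to write the body field; the embeddings are produced without any extra ingest step.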

Crawl the docs

Open Crawler

This HAS TO BE RUN on a Linux/Mac/Windows host or VM, NOT in Colab.

The blog details the steps below running on a MacBook.

You can also review the Open Crawler setup.

High level steps to configure and run crawler

  • Clone the repo
      git clone git@github.com:elastic/crawler.git
  • Build the Open Crawler Docker container
      docker build -t crawler-image . && docker run -i -d --name crawler crawler-image
  • Create a new config file
      vi config/elastic-labs.yml
      Run the Generate Config cell below, then paste the output into the config file and save.
  • Copy the new local config into the container
      docker cp config/elastic-labs.yml crawler:/app/config/elastic-labs.yml
  • Run the crawler
      docker exec -it crawler bin/crawler crawl config/elastic-labs.yml

Generate Config

Run the cell below to generate the YAML config file.

[ ]
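
The cell's code is not included in this export. A hedged sketch of a generator for such a config is shown below. The config keys follow my recollection of the Open Crawler example config and should be checked against the crawler repository; the index name, port, and placeholder credentials are assumptions.

```python
def build_crawler_config(
    es_host: str,
    api_key: str,
    es_port: int = 443,                 # assumed port for a hosted endpoint
    output_index: str = "elastic-labs",  # hypothetical index name
) -> str:
    """Render a YAML config that crawls the three Elastic Labs sites."""
    sites = ["search-labs", "observability-labs", "security-labs"]
    lines = ["domains:", "  - url: https://www.elastic.co", "    seed_urls:"]
    lines += [f"      - https://www.elastic.co/{site}" for site in sites]
    lines += [
        # Send crawled documents to Elasticsearch rather than to a file.
        "output_sink: elasticsearch",
        f"output_index: {output_index}",
        "elasticsearch:",
        f"  host: {es_host}",
        f"  port: {es_port}",
        f"  api_key: {api_key}",
    ]
    return "\n".join(lines) + "\n"


# Placeholder values; substitute the project endpoint and project API key.
print(build_crawler_config("https://YOUR_PROJECT_ENDPOINT", "YOUR_PROJECT_API_KEY"))
```

The printed text is what would be pasted into config/elastic-labs.yml before copying it into the container.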

Confirm the docs have been crawled

First, look at the count of docs for each Labs site

[20]
{'_shards': {'failed': 0, 'skipped': 0, 'successful': 5, 'total': 5},
 'aggregations': {'url_path_dir1': {'buckets': [{'doc_count': 216,
                                                 'key': 'search-labs'},
                                                {'doc_count': 214,
                                                 'key': 'security-labs'},
                                                {'doc_count': 158,
                                                 'key': 'observability-labs'}],
                                    'doc_count_error_upper_bound': 0,
                                    'sum_other_doc_count': 0}},
 'hits': {'hits': [],
          'max_score': None,
          'total': {'relation': 'eq', 'value': 588}},
 'timed_out': False,
 'took': 6}
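
A response shaped like the one above could come from a terms aggregation with no hits returned. The sketch below is an assumption about the query, not the notebook's code: the aggregation name matches the output, but the field name is guessed from it and may need a keyword subfield depending on the mapping.

```python
def labs_count_query(field: str = "url_path_dir1") -> dict:
    """Build a search body that counts documents per first URL path segment."""
    return {
        "size": 0,  # return aggregation buckets only, no hits
        "aggs": {"url_path_dir1": {"terms": {"field": field}}},
    }


def counts_by_site(response: dict) -> dict:
    """Flatten the aggregation buckets into a {site: doc_count} mapping."""
    buckets = response["aggregations"]["url_path_dir1"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}
```

Applied to the response above, counts_by_site gives 216, 214, and 158 for the three sites, which add up to the 588 total hits.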

Next, review a sample doc

[23]
Streaming output truncated to the last 5000 lines.
                                                                                    'autoscaling metrics API exposes a list of ingestion load values, '
                                                                                    'one for each indexing node. Note that as the write thread pools '
                                                                                    '(which handle indexing requests) are sized based on the number of '
                                                                                    'CPU cores on the node, this essentially determines the total '
                                                                                    'number of cores that is needed in the cluster to handle the '
                                                                                    'indexing workload. The ingestion load on each indexing node '
                                                                                    'consists of two components: Thread pool utilization: the average '
                                                                                    'number of threads in the write thread pool processing indexing '
                                                                                    'requests during that sampling period. Queued ingestion load: the '
                                                                                    'estimated number of threads needed to handle queued write '
                                                                                    'requests. The ingestion load of each indexing node is calculated '
                                                                                    'as the sum of these two values for all the three write thread '
                                                                                    'pools . The total ingestion load of the Elasticsearch cluster is '
                                                                                    'the sum of the ingestion load of the individual nodes. '
                                                                                    'n o d e _ i n g e s t i o n _ l o a d = ∑ ( t h'},
                                                                           {'embeddings': {'##est': 1.3433179,
                                                                                           '##estinal': 0.5916747,
                                                                                           '##ical': 0.21335103,
                                                                                           '##ing': 0.66160166,
                                                                                           '##ion': 1.223692,
                                                                                           '##l': 0.06755174,
                                                                                           '##ler': 0.34178317,
                                                                                           '##line': 0.6707441,
                                                                                           '##ling': 1.0343578,
                                                                                           '##load': 0.9880499,
                                                                                           '##mat': 0.01314945,
                                                                                           '##rch': 1.3459072,
                                                                                           '##s': 0.25005433,
                                                                                           '##sca': 1.6867673,
                                                                                           '##scu': 0.028700678,
                                                                                           '##sea': 1.6748068,
                                                                                           '_': 0.28835136,
                                                                                           'access': 0.116686985,
                                                                                           'accounting': 0.15865436,
                                                                                           'algorithm': 1.0487378,
                                                                                           'algorithms': 0.2763102,
                                                                                           'allocation': 0.1481772,
                                                                                           'amazon': 0.9099395,
                                                                                           'among': 0.04313716,
                                                                                           'anal': 0.025087006,
                                                                                           'analysis': 0.64178395,
                                                                                           'analyze': 0.18673302,
                                                                                           'and': 0.19101046,
                                                                                           'apache': 0.6617465,
                                                                                           'api': 1.4468017,
                                                                                           'approximate': 0.026616694,
                                                                                           'are': 0.19081613,
                                                                                           'arithmetic': 0.12217364,
                                                                                           'ass': 0.12156314,
                                                                                           'auto': 1.4633765,
                                                                                           'automatic': 0.73048806,
                                                                                           'availability': 0.20461462,
                                                                                           'average': 0.58710635,
                                                                                           'bot': 0.12357169,
                                                                                           'buffer': 0.14556783,
                                                                                           'calculate': 0.02387442,
                                                                                           'calculated': 0.2452304,
                                                                                           'calculation': 0.81089926,
                                                                                           'called': 0.2972479,
                                                                                           'capacity': 0.60224617,
                                                                                           'catalog': 0.078262925,
                                                                                           'category': 0.21683785,
                                                                                           'checkpoint': 0.012995078,
                                                                                           'chess': 0.41694775,
                                                                                           'chip': 0.10178017,
                                                                                           'class': 0.5914888,
                                                                                           'classification': 0.17686933,
                                                                                           'cluster': 1.4369037,
                                                                                           'clusters': 0.21254443,
                                                                                           'comply': 0.131236,
                                                                                           'component': 0.37191656,
                                                                                           'components': 0.87235415,
                                                                                           'computation': 0.47024545,
                                                                                           'compute': 0.14372817,
                                                                                           'computer': 0.397558,
                                                                                           'constant': 0.09540719,
                                                                                           'consumption': 0.123454005,
                                                                                           'cope': 0.7024604,
                                                                                           'core': 0.62535626,
                                                                                           'cores': 1.0230916,
                                                                                           'cpu': 0.874175,
                                                                                           'crawl': 0.23010625,
                                                                                           'current': 0.5516459,
                                                                                           'data': 0.25792596,
                                                                                           'database': 0.4601695,
                                                                                           'determine': 0.3844099,
                                                                                           'determined': 0.41348428,
                                                                                           'diagram': 0.025166756,
                                                                                           'dimensions': 0.07042265,
                                                                                           'disk': 0.07931721,
                                                                                           'each': 0.22229394,
                                                                                           'elastic': 1.8257822,
                                                                                           'enter': 0.058845505,
                                                                                           'equation': 0.43812877,
                                                                                           'es': 0.8055687,
                                                                                           'estimate': 0.03608101,
                                                                                           'estimated': 0.46266982,
                                                                                           'execution': 0.05638616,
                                                                                           'factors': 0.12973839,
                                                                                           'forest': 0.3904727,
                                                                                           'formula': 0.016075172,
                                                                                           'framework': 0.34186286,
                                                                                           'g': 0.08017753,
                                                                                           'gage': 0.30852094,
                                                                                           'gene': 0.27250904,
                                                                                           'handle': 0.9037246,
                                                                                           'handling': 0.69093794,
                                                                                           'implement': 0.053764082,
                                                                                           'index': 1.3896008,
                                                                                           'indexed': 0.25086805,
                                                                                           'ing': 1.5002296,
                                                                                           'integration': 0.20222682,
                                                                                           'interface': 0.25386703,
                                                                                           'inventory': 0.5645011,
                                                                                           'is': 0.05772473,
                                                                                           'java': 1.2391971,
                                                                                           'l': 0.048691455,
                                                                                           'lake': 0.24773102,
                                                                                           'lane': 0.25919613,
                                                                                           'lang': 0.039321195,
                                                                                           'learning': 0.033810128,
                                                                                           'library': 0.14143226,
                                                                                           'list': 0.10985089,
                                                                                           'lists': 0.12752165,
                                                                                           'load': 1.7350225,
                                                                                           'loaded': 0.057171866,
                                                                                           'loading': 0.75305617,
                                                                                           'loads': 0.12072936,
                                                                                           'log': 0.06388949,
                                                                                           'machine': 0.47294563,
                                                                                           'mass': 0.092697844,
                                                                                           'math': 0.7472431,
                                                                                           'matrix': 0.045127213,
                                                                                           'maximum': 0.094020285,
                                                                                           'measure': 0.32414404,
                                                                                           'memories': 0.03024405,
                                                                                           'memory': 1.2586498,
                                                                                           'method': 0.016832462,
                                                                                           'metric': 1.1439759,
                                                                                           'mining': 0.40203753,
                                                                                           'mp': 0.09331862,
                                                                                           'multi': 0.031247457,
                                                                                           'multiple': 0.38688186,
                                                                                           'n': 0.33228758,
                                                                                           'need': 0.19645856,
                                                                                           'network': 0.42359397,
                                                                                           'new': 0.041632555,
                                                                                           'node': 1.3807943,
                                                                                           'nodes': 0.63807905,
                                                                                           'number': 0.4450389,
                                                                                           'o': 0.50335085,
                                                                                           'operation': 0.008523868,
                                                                                           'order': 0.08601924,
                                                                                           'pattern': 0.11067777,
                                                                                           'percent': 0.13746342,
                                                                                           'performance': 0.41614294,
                                                                                           'period': 0.49507552,
                                                                                           'pool': 1.3188534,
                                                                                           'poole': 0.3433027,
                                                                                           'pools': 1.2800426,
                                                                                           'predict': 0.23377013,
                                                                                           'processing': 1.0733001,
                                                                                           'processor': 0.10840816,
                                                                                           'pure': 0.11351536,
                                                                                           'quantity': 0.109573685,
                                                                                           'queue': 1.1129105,
                                                                                           'ram': 0.14691876,
                                                                                           'rank': 0.36504152,
                                                                                           'ratio': 0.011385939,
                                                                                           'read': 0.13304754,
                                                                                           'represent': 0.42444453,
                                                                                           'representation': 0.058323957,
                                                                                           'request': 0.755568,
                                                                                           'requests': 0.7039498,
                                                                                           'routing': 0.060857404,
                                                                                           'sample': 0.62170815,
                                                                                           'sampling': 0.8610632,
                                                                                           'scala': 0.25192302,
                                                                                           'scale': 0.5968038,
                                                                                           'sea': 0.20613533,
                                                                                           'search': 0.4318061,
                                                                                           'semi': 0.33687106,
                                                                                           'sequence': 0.23863083,
                                                                                           'serial': 0.15801017,
                                                                                           'server': 0.16233677,
                                                                                           'si': 0.2002626,
                                                                                           'sid': 0.44975162,
                                                                                           'size': 0.8577202,
                                                                                           'sized': 0.21010487,
                                                                                           'sizes': 0.4059122,
                                                                                           'small': 0.09116832,
                                                                                           'software': 0.09232291,
                                                                                           'sort': 0.35720947,
                                                                                           'sorting': 0.06234357,
                                                                                           'spectrum': 0.07792632,
                                                                                           'sql': 0.116530605,
                                                                                           'statistical': 0.0852167,
                                                                                           'statistics': 0.22820702,
                                                                                           'stomach': 0.018201118,
                                                                                           'sum': 0.89766365,
                                                                                           'swarm': 0.20437151,
                                                                                           'table': 0.007837142,
                                                                                           'task': 0.37974054,
                                                                                           'taste': 0.053832427,
                                                                                           'taylor': 0.10206632,
                                                                                           'thread': 1.5052487,
                                                                                           'threads': 1.2515007,
                                                                                           'three': 0.27322263,
                                                                                           'total': 0.64918166,
                                                                                           'tree': 0.098200426,
                                                                                           'unit': 0.15584692,
                                                                                           'used': 0.56170344,
                                                                                           'useful': 0.34977943,
                                                                                           'utilization': 1.0091052,
                                                                                           'value': 0.7453479,
                                                                                           'values': 0.63835937,
                                                                                           'vector': 0.3917736,
                                                                                           'weaving': 0.11804886,
                                                                                           'web': 0.46383187,
                                                                                           'work': 0.29207155,
                                                                                           'write': 1.1660185,
                                                                                           'writing': 0.25973478,
                                                                                           'z': 0.3776876},
                                                                            'text': 'that '
                                                                                    'are '
                                                                                    'used '
                                                                                    'for '
                                                                                    'ingest '
                                                                                    'autoscaling '
                                                                                    'in '
                                                                                    'Elasticsearch '
                                                                                    'are '
                                                                                    'ingestion '
                                                                                    'load '
                                                                                    'and '
                                                                                    'memory. '
                                                                                    'Ingestion '
                                                                                    'load '
                                                                                    'Ingestion '
                                                                                    'load '
                                                                                    'represents '
                                                                                    'the '
                                                                                    'number '
                                                                                    'of '
                                                                                    'threads '
                                                                                    'that '
                                                                                    'is '
                                                                                    'needed '
                                                                                    'to '
                                                                                    'cope '
                                                                                    'with '
                                                                                    'the '
                                                                                    'current '
                                                                                    'indexing '
                                                                                    'load. '
                                                                                    'The '
                                                                                    'autoscaling '
                                                                                    'metrics '
                                                                                    'API '
                                                                                    'exposes '
                                                                                    'a '
                                                                                    'list '
                                                                                    'of '
                                                                                    'ingestion '
                                                                                    'load '
                                                                                    'values, '
                                                                                    'one '
                                                                                    'for '
                                                                                    'each '
                                                                                    'indexing '
                                                                                    'node. '
                                                                                    'Note '
                                                                                    'that '
                                                                                    'as '
                                                                                    'the '
                                                                                    'write '
                                                                                    'thread '
                                                                                    'pools '
                                                                                    '(which '
                                                                                    'handle '
                                                                                    'indexing '
                                                                                    'requests) '
                                                                                    'are '
                                                                                    'sized '
                                                                                    'based '
                                                                                    'on '
                                                                                    'the '
                                                                                    'number '
                                                                                    'of '
                                                                                    'CPU '
                                                                                    'cores '
                                                                                    'on '
                                                                                    'the '
                                                                                    'node, '
                                                                                    'this '
                                                                                    'essentially '
                                                                                    'determines '
                                                                                    'the '
                                                                                    'total '
                                                                                    'number '
                                                                                    'of '
                                                                                    'cores '
                                                                                    'that '
                                                                                    'is '
                                                                                    'needed '
                                                                                    'in '
                                                                                    'the '
                                                                                    'cluster '
                                                                                    'to '
                                                                                    'handle '
                                                                                    'the '
                                                                                    'indexing '
                                                                                    'workload. '
                                                                                    'The '
                                                                                    'ingestion '
                                                                                    'load '
                                                                                    'on '
                                                                                    'each '
                                                                                    'indexing '
                                                                                    'node '
                                                                                    'consists '
                                                                                    'of '
                                                                                    'two '
                                                                                    'components: '
                                                                                    'Thread '
                                                                                    'pool '
                                                                                    'utilization: '
                                                                                    'the '
                                                                                    'average '
                                                                                    'number '
                                                                                    'of '
                                                                                    'threads '
                                                                                    'in '
                                                                                    'the '
                                                                                    'write '
                                                                                    'thread '
                                                                                    'pool '
                                                                                    'processing '
                                                                                    'indexing '
                                                                                    'requests '
                                                                                    'during '
                                                                                    'that '
                                                                                    'sampling '
                                                                                    'period. '
                                                                                    'Queued '
                                                                                    'ingestion '
                                                                                    'load: '
                                                                                    'the '
                                                                                    'estimated '
                                                                                    'number '
                                                                                    'of '
                                                                                    'threads '
                                                                                    'needed '
                                                                                    'to '
                                                                                    'handle '
                                                                                    'queued '
                                                                                    'write '
                                                                                    'requests. '
                                                                                    'The '
                                                                                    'ingestion '
                                                                                    'load '
                                                                                    'of '
                                                                                    'each '
                                                                                    'indexing '
                                                                                    'node '
                                                                                    'is '
                                                                                    'calculated '
                                                                                    'as '
                                                                                    'the '
                                                                                    'sum '
                                                                                    'of '
                                                                                    'these '
                                                                                    'two '
                                                                                    'values '
                                                                                    'for '
                                                                                    'all '
                                                                                    'the '
                                                                                    'three '
                                                                                    'write '
                                                                                    'thread '
                                                                                     'pools. '
                                                                                    'The '
                                                                                    'total '
                                                                                    'ingestion '
                                                                                    'load '
                                                                                    'of '
                                                                                    'the '
                                                                                    'Elasticsearch '
                                                                                    'cluster '
                                                                                    'is '
                                                                                    'the '
                                                                                    'sum '
                                                                                    'of '
                                                                                    'the '
                                                                                    'ingestion '
                                                                                    'load '
                                                                                    'of '
                                                                                    'the '
                                                                                    'individual '
                                                                                    'nodes. '
                                                                                     'node_ingestion_load '
                                                                                     '= '
                                                                                     '∑(thread_pool_utilization '
                                                                                     '+ '
                                                                                     'queued_ingestion_load) '
                                                                                    't '
                                                                                    'o '
                                                                                    't '
                                                                                    'a '
                                                                                    'l '
                                                                                    '_ '
                                                                                    'i '
                                                                                    'n '
                                                                                    'g '
                                                                                    'e '
                                                                                    's '
                                                                                    't '
                                                                                    'i '
                                                                                    'o '
                                                                                    'n '
                                                                                    '_ '
                                                                                    'l '
                                                                                    'o '
                                                                                    'a '
                                                                                    'd '
                                                                                    '= '
                                                                                    '∑ '
                                                                                    '( '
                                                                                    'n '
                                                                                    'o '
                                                                                    'd '
                                                                                    'e '
                                                                                    '_ '
                                                                                    'i '
                                                                                    'n '
                                                                                    'g '
                                                                                    'e '
                                                                                    's '
                                                                                    't '
                                                                                    'i '
                                                                                    'o '
                                                                                    'n '
                                                                                    '_ '
                                                                                    'l '
                                                                                    'o '
                                                                                    'a '
                                                                                    'd '
                                                                                    ') '
                                                                                    '\\small '
                                                                                    'node\\_ingestion\\_load '
                                                                                    '= '
                                                                                    '\\sum(thread\\_pool\\_utilization '
                                                                                    '+ '
                                                                                    'queued\\_ingestion\\_load) '
                                                                                    '\\newline '
                                                                                    'total\\_ingestion\\_load '
                                                                                    '= '
                                                                                    '\\sum(node\\_ingestion\\_load) '
                                                                                    'n '
                                                                                    'o '
                                                                                    'd '
                                                                                    'e '
                                                                                    '_ '
                                                                                    'in '
                                                                                    'g '
                                                                                    'es '
                                                                                    't '
                                                                                    'i '
                                                                                    'o '
                                                                                    'n '
                                                                                    '_ '
                                                                                    'l '
                                                                                    'o '
                                                                                    'a '
                                                                                    'd '
                                                                                    '= '
                                                                                    '∑ '
                                                                                    '( '
                                                                                    't '
                                                                                    'h '
                                                                                    're '
                                                                                    'a '
                                                                                    'd '
                                                                                    '_ '
                                                                                    'p '
                                                                                    'oo '
                                                                                    'l '
                                                                                    '_ '
                                                                                    'u '
                                                                                    't '
                                                                                    'i '
                                                                                    'l '
                                                                                    'i '
                                                                                    'z '
                                                                                    'a '
                                                                                    't '
                                                                                    'i '
                                                                                    'o '
                                                                                    'n '
                                                                                    '+ '
                                                                                    'q '
                                                                                    'u '
                                                                                    'e '
                                                                                    'u '
                                                                                    'e '
                                                                                    'd '
                                                                                    '_ '
                                                                                    'in '
                                                                                    'g '
                                                                                    'es '
                                                                                    't '
                                                                                    'i '
                                                                                    'o '
                                                                                    'n '
                                                                                    '_ '
                                                                                    'l '
                                                                                    'o '
                                                                                    'a '
                                                                                    'd '
                                                                                    ') '
                                                                                    't '
                                                                                    'o '
                                                                                    't '
                                                                                    'a '
                                                                                    'l '
                                                                                    '_ '
                                                                                    'in '
                                                                                    'g '
                                                                                    'es '
                                                                                    't '
                                                                                    'i '
                                                                                    'o '
                                                                                    'n '
                                                                                    '_ '
                                                                                    'l '
                                                                                    'o '
                                                                                    'a '
                                                                                    'd '
                                                                                    '= '
                                                                                    '∑ '
                                                                                    '( '
                                                                                    'n '
                                                                                    'o '
                                                                                    'd '
                                                                                    'e '
                                                                                    '_ '
                                                                                    'in '
                                                                                    'g '
                                                                                    'es '
                                                                                    't '
                                                                                    'i '
                                                                                    'o '
                                                                                    'n '
                                                                                    '_ '
                                                                                    'l '
                                                                                    'o '
                                                                                    'a '
                                                                                    'd '
                                                                                    ') '
                                                                                    'Figure '
                                                                                    '2 '
                                                                                    ': '
                                                                                    'ingestion'},
                                                                           {'embeddings': {'##able': 0.5624876,
                                                                                           '##ba': 0.10684605,
                                                                                           '##d': 0.12233314,
                                                                                           '##est': 0.84587747,
                                                                                           '##ima': 0.2508807,
                                                                                           '##ing': 0.57414246,
                                                                                           '##ion': 1.1121849,
                                                                                           '##line': 1.1430916,
                                                                                           '##ma': 1.1706055,
                                                                                           '##w': 1.3673741,
                                                                                           '##ws': 0.33763555,
                                                                                           '10': 0.51392806,
                                                                                           '200': 0.73087466,
                                                                                           '30': 0.45019,
                                                                                           '60': 1.3045075,
                                                                                           '[UNK]': 0.2956499,
                                                                                           '_': 0.33742356,
                                                                                           'acceptable': 0.29635867,
                                                                                           'access': 0.23300913,
                                                                                           'accounting': 0.1906402,
                                                                                           'achieve': 0.19722655,
                                                                                           'algorithm': 1.1037958,
                                                                                           'algorithms': 0.26360378,
                                                                                           'allocation': 0.53156596,
                                                                                           'analysis': 0.41347402,
                                                                                           'apache': 0.54295164,
                                                                                           'api': 0.21713388,
                                                                                           'approximate': 0.51163644,
                                                                                           'arithmetic': 0.005784557,
                                                                                           'availability': 0.4917338,
                                                                                           'average': 0.8478212,
                                                                                           'batch': 0.08666975,
                                                                                           'blocking': 0.02501016,
                                                                                           'bot': 0.06050198,
                                                                                           'buffer': 0.40386045,
                                                                                           'bug': 0.055751722,
                                                                                           'busy': 1.3026394,
                                                                                           'calculate': 0.26999432,
                                                                                           'calculation': 0.74316484,
                                                                                           'capacity': 0.6725085,
                                                                                           'chess': 0.25134456,
                                                                                           'class': 0.328252,
                                                                                           'client': 0.23896244,
                                                                                           'clock': 1.125488,
                                                                                           'cluster': 0.5103067,
                                                                                           'component': 0.2536751,
                                                                                           'components': 0.78435194,
                                                                                           'computation': 0.62016183,
                                                                                           'compute': 0.06482519,
                                                                                           'computer': 0.32330835,
                                                                                           'concurrency': 0.011380989,
                                                                                           'configuration': 0.6887391,
                                                                                           'configured': 0.26263618,
                                                                                           'constant': 0.29082793,
                                                                                           'consumption': 0.16989039,
                                                                                           'cpu': 0.3717718,
                                                                                           'database': 0.13461274,
                                                                                           'e': 0.7789312,
                                                                                           'effect': 0.09419204,
                                                                                           'effort': 0.055172946,
                                                                                           'employee': 0.3274528,
                                                                                           'employees': 0.14320064,
                                                                                           'ensemble': 0.19942468,
                                                                                           'equation': 0.3787911,
                                                                                           'equivalent': 0.050270963,
                                                                                           'error': 0.12898737,
                                                                                           'es': 0.043630168,
                                                                                           'est': 0.20599021,
                                                                                           'estimate': 1.0792123,
                                                                                           'estimated': 0.39457676,
                                                                                           'estimates': 0.465428,
                                                                                           'estimation': 0.080784135,
                                                                                           'every': 0.16873945,
                                                                                           'excess': 1.0022457,
                                                                                           'excessive': 0.451759,
                                                                                           'execute': 0.59175754,
                                                                                           'executing': 0.091966435,
                                                                                           'execution': 1.3065349,
                                                                                           'existing': 0.6437884,
                                                                                           'exponential': 1.1467187,
                                                                                           'extra': 0.26056916,
                                                                                           'figure': 0.019528389,
                                                                                           'finish': 0.012790194,
                                                                                           'finished': 0.21236378,
                                                                                           'flow': 0.10995065,
                                                                                           'g': 0.43504617,
                                                                                           'gage': 0.4229588,
                                                                                           'group': 0.43960038,
                                                                                           'guild': 0.014967873,
                                                                                           'handle': 0.80899215,
                                                                                           'handling': 0.7681083,
                                                                                           'heap': 0.3867438,
                                                                                           'hours': 0.7462872,
                                                                                           'http': 0.20072725,
                                                                                           'implement': 0.16245411,
                                                                                           'implementation': 0.2408709,
                                                                                           'improve': 0.10136651,
                                                                                           'index': 1.2976965,
                                                                                           'indexed': 0.10614389,
                                                                                           'ing': 1.2063053,
                                                                                           'inventory': 0.25356865,
                                                                                           'java': 1.2153534,
                                                                                           'l': 0.48968774,
                                                                                           'lake': 0.27167574,
                                                                                           'lane': 0.54473066,
                                                                                           'length': 0.64622724,
                                                                                           'library': 0.08392323,
                                                                                           'line': 0.5581907,
                                                                                           'load': 1.5088638,
                                                                                           'loading': 0.5335804,
                                                                                           'machine': 0.3173762,
                                                                                           'manage': 0.5220977,
                                                                                           'managed': 0.45824686,
                                                                                           'management': 0.3230387,
                                                                                           'mass': 0.15742503,
                                                                                           'math': 0.81244004,
                                                                                           'maximum': 0.34374076,
                                                                                           'measure': 0.25600985,
                                                                                           'memory': 0.5085309,
                                                                                           'mining': 0.4451848,
                                                                                           'minute': 0.39483455,
                                                                                           'minutes': 0.22895378,
                                                                                           'moving': 0.76410496,
                                                                                           'mp': 0.046217,
                                                                                           'multiple': 0.10666605,
                                                                                           'n': 0.5416694,
                                                                                           'network': 0.3097243,
                                                                                           'new': 0.49582836,
                                                                                           'node': 1.1907045,
                                                                                           'number': 0.47905272,
                                                                                           'o': 0.47123736,
                                                                                           'operation': 0.19577809,
                                                                                           'optimal': 0.1733028,
                                                                                           'par': 0.09612937,
                                                                                           'percent': 0.1152151,
                                                                                           'performance': 0.74001515,
                                                                                           'pool': 1.7006081,
                                                                                           'poole': 0.36192703,
                                                                                           'pools': 1.0764378,
                                                                                           'predict': 0.38117534,
                                                                                           'probe': 0.2430691,
                                                                                           'process': 0.12230635,
                                                                                           'processing': 0.47061718,
                                                                                           'proportion': 0.2145018,
                                                                                           'proportional': 1.1204233,
                                                                                           'proposal': 0.1401456,
                                                                                           'q': 0.3259466,
                                                                                           'queue': 1.580318,
                                                                                           'r': 0.14266703,
                                                                                           'rank': 0.13613336,
                                                                                           'rate': 0.39469108,
                                                                                           'request': 1.1001134,
                                                                                           'requests': 0.63539153,
                                                                                           'resolution': 0.055606272,
                                                                                           'resource': 0.21417612,
                                                                                           'resources': 0.7937882,
                                                                                           'routing': 0.14261606,
                                                                                           'sample': 1.0720835,
                                                                                           'sampled': 1.0306277,
                                                                                           'samples': 1.2079935,
                                                                                           'sampling': 0.6740413,
                                                                                           'scala': 0.07395835,
                                                                                           'script': 0.10171158,
                                                                                           'second': 0.18827602,
                                                                                           'seconds': 0.817573,
                                                                                           'sequence': 0.49634397,
                                                                                           'serial': 0.033651996,
                                                                                           'server': 0.32002103,
                                                                                           'share': 0.27626935,
                                                                                           'sid': 0.27850676,
                                                                                           'size': 0.11843514,
                                                                                           'small': 0.75451213,
                                                                                           'speed': 0.30091006,
                                                                                           'sql': 0.31397846,
                                                                                           'statistical': 0.0100006005,
                                                                                           'strategy': 0.08963276,
                                                                                           'stream': 0.028335843,
                                                                                           'sum': 1.1407199,
                                                                                           'surplus': 0.15598625,
                                                                                           'swarm': 0.054142684,
                                                                                           'task': 1.2177191,
                                                                                           'tasks': 1.0780356,
                                                                                           'taylor': 0.24217507,
                                                                                           'technique': 0.0030198945,
                                                                                           'thread': 1.7842301,
                                                                                           'threads': 0.9916815,
                                                                                           'time': 0.9839317,
                                                                                           'timer': 0.19039534,
                                                                                           'times': 0.5299459,
                                                                                           'total': 0.40682667,
                                                                                           'traffic': 0.28910428,
                                                                                           'universe': 0.013594781,
                                                                                           'usage': 0.5520448,
                                                                                           'utilization': 1.6104044,
                                                                                           'value': 0.6036144,
                                                                                           'values': 0.33944046,
                                                                                           'w': 0.4972394,
                                                                                           'wait': 0.005872378,
                                                                                           'wall': 1.1351137,
                                                                                           'weaving': 0.13777943,
                                                                                           'web': 0.2821159,
                                                                                           'weighted': 1.1533256,
                                                                                           'worker': 1.0417976,
                                                                                           'workers': 1.2245823,
                                                                                           'z': 0.29032487},
                                                                             'text': '\\small '
                                                                                     'node\\_ingestion\\_load '
                                                                                     '= '
                                                                                     '\\sum(thread\\_pool\\_utilization '
                                                                                     '+ '
                                                                                     'queued\\_ingestion\\_load) '
                                                                                     '\\newline '
                                                                                     'total\\_ingestion\\_load '
                                                                                     '= '
                                                                                     '\\sum(node\\_ingestion\\_load) '
                                                                                    'Figure '
                                                                                    '2 '
                                                                                    ': '
                                                                                    'ingestion '
                                                                                    'load '
                                                                                    'components '
                                                                                    'The '
                                                                                    'thread '
                                                                                    'pool '
                                                                                    'utilization '
                                                                                    'is '
                                                                                    'an '
                                                                                    'exponentially '
                                                                                    'weighted '
                                                                                    'moving '
                                                                                    'average '
                                                                                    '(EWMA) '
                                                                                    'of '
                                                                                    'the '
                                                                                    'number '
                                                                                    'of '
                                                                                    'busy '
                                                                                    'threads '
                                                                                    'in '
                                                                                    'the '
                                                                                    'thread '
                                                                                    'pool, '
                                                                                    'sampled '
                                                                                    'every '
                                                                                    'second. '
                                                                                    'The '
                                                                                    'EWMA '
                                                                                    'of '
                                                                                    'the '
                                                                                    'sampled '
                                                                                    'thread '
                                                                                    'pool '
                                                                                    'utilization '
                                                                                    'values '
                                                                                    'is '
                                                                                    'configured '
                                                                                    'such '
                                                                                    'that '
                                                                                    'the '
                                                                                    'sampled '
                                                                                    'values '
                                                                                    'of '
                                                                                    'the '
                                                                                    'past '
                                                                                    '10 '
                                                                                    'seconds '
                                                                                    'have '
                                                                                    'the '
                                                                                    'most '
                                                                                    'effect '
                                                                                    'on '
                                                                                    'the '
                                                                                    'thread '
                                                                                    'pool '
                                                                                    'utilization '
                                                                                    'component '
                                                                                    'of '
                                                                                    'the '
                                                                                    'ingestion '
                                                                                    'load '
                                                                                    'and '
                                                                                    'samples '
                                                                                    'older '
                                                                                    'than '
                                                                                    '60 '
                                                                                    'seconds '
                                                                                    'have '
                                                                                     'negligible '
                                                                                    'impact. '
                                                                                    'To '
                                                                                    'estimate '
                                                                                    'the '
                                                                                    'resources '
                                                                                    'required '
                                                                                    'to '
                                                                                    'handle '
                                                                                    'the '
                                                                                    'queued '
                                                                                    'indexing '
                                                                                    'requests '
                                                                                    'in '
                                                                                    'the '
                                                                                    'thread '
                                                                                    'pool, '
                                                                                    'we '
                                                                                    'need '
                                                                                    'to '
                                                                                    'have '
                                                                                    'an '
                                                                                    'estimate '
                                                                                    'for '
                                                                                    'how '
                                                                                    'long '
                                                                                    'each '
                                                                                    'queued '
                                                                                    'task '
                                                                                    'can '
                                                                                    'take '
                                                                                    'to '
                                                                                    'execute. '
                                                                                    'To '
                                                                                    'achieve '
                                                                                    'this, '
                                                                                    'each '
                                                                                    'thread '
                                                                                    'pool '
                                                                                    'also '
                                                                                    'provides '
                                                                                    'an '
                                                                                    'EWMA '
                                                                                    'of '
                                                                                    'the '
                                                                                    'request '
                                                                                    'execution '
                                                                                    'time. '
                                                                                    'The '
                                                                                    'request '
                                                                                    'execution '
                                                                                    'time '
                                                                                    'for '
                                                                                    'an '
                                                                                    'indexing '
                                                                                    'request '
                                                                                    'is '
                                                                                    'the '
                                                                                    '(wall-clock) '
                                                                                    'time '
                                                                                    'taken '
                                                                                    'for '
                                                                                    'the '
                                                                                    'request '
                                                                                    'to '
                                                                                    'finish '
                                                                                    'once '
                                                                                    'it '
                                                                                    'is '
                                                                                    'out '
                                                                                    'of '
                                                                                    'the '
                                                                                    'queue '
                                                                                    'and '
                                                                                    'a '
                                                                                    'worker '
                                                                                    'thread '
                                                                                    'starts '
                                                                                    'executing '
                                                                                    'it. '
                                                                                    'As '
                                                                                    'some '
                                                                                    'queueing '
                                                                                    'is '
                                                                                    'acceptable '
                                                                                    'and '
                                                                                    'should '
                                                                                    'be '
                                                                                    'manageable '
                                                                                    'by '
                                                                                    'the '
                                                                                    'thread '
                                                                                    'pool, '
                                                                                    'we '
                                                                                    'try '
                                                                                    'to '
                                                                                    'estimate '
                                                                                    'the '
                                                                                    'resources '
                                                                                    'needed '
                                                                                    'to '
                                                                                    'handle '
                                                                                    'the '
                                                                                    'excess '
                                                                                    'queueing. '
                                                                                    'We '
                                                                                    'consider '
                                                                                    'up '
                                                                                    'to '
                                                                                    '30s '
                                                                                    'worth '
                                                                                    'of '
                                                                                    'tasks '
                                                                                    'in '
                                                                                    'the '
                                                                                    'queue '
                                                                                    'manageable '
                                                                                    'by '
                                                                                    'the '
                                                                                    'existing '
                                                                                    'number '
                                                                                    'of '
                                                                                    'workers '
                                                                                    'and '
                                                                                    'account '
                                                                                    'for '
                                                                                    'an '
                                                                                    'extra '
                                                                                    'thread '
                                                                                    'proportional '
                                                                                    'to '
                                                                                    'this '
                                                                                    'value. '
                                                                                    'For '
                                                                                    'example, '
                                                                                    'if '
                                                                                    'the '
                                                                                    'average '
                                                                                    'task '
                                                                                    'execution '
                                                                                    'time '
                                                                                    'is '
                                                                                    '200ms, '
                                                                                    'we '
                                                                                    'estimate '
                                                                                    'that'},
                                                                           {'embeddings': {'##d': 0.06352329,
                                                                                           '##est': 0.89852107,
                                                                                           '##estinal': 0.13183321,
                                                                                           '##ima': 0.40056115,
                                                                                           '##ing': 0.61320734,
                                                                                           '##ion': 0.72260284,
                                                                                           '##ling': 0.8949169,
                                                                                           '##load': 0.57369965,
                                                                                           '##m': 0.23721623,
                                                                                           '##ma': 1.4438714,
                                                                                           '##mas': 0.24820994,
                                                                                           '##mat': 0.24343531,
                                                                                           '##sca': 0.92204034,
                                                                                           '##w': 1.6598973,
                                                                                           '##ws': 0.6782139,
                                                                                           '10': 0.7749067,
                                                                                           '150': 1.2471286,
                                                                                           '200': 0.58304185,
                                                                                           '30': 1.076181,
                                                                                           '60': 1.1588365,
                                                                                           '_': 0.17651597,
                                                                                           'acceptable': 0.0395143,
                                                                                           'access': 0.05357292,
                                                                                           'accounting': 0.22549874,
                                                                                           'achieve': 0.040418815,
                                                                                           'algorithm': 0.9928478,
                                                                                           'algorithms': 0.08838318,
                                                                                           'allocation': 0.7647576,
                                                                                           'analysis': 0.428812,
                                                                                           'apache': 0.5859765,
                                                                                           'api': 0.016843364,
                                                                                           'approximate': 0.21684457,
                                                                                           'arithmetic': 0.053462975,
                                                                                           'array': 0.066098064,
                                                                                           'auto': 0.53497416,
                                                                                           'automatic': 0.20355695,
                                                                                           'availability': 0.6690054,
                                                                                           'average': 1.0341543,
                                                                                           'blocking': 0.1431715,
                                                                                           'buffer': 0.46087772,
                                                                                           'bug': 0.23163809,
                                                                                           'busy': 1.3082193,
                                                                                           'calculate': 0.2015065,
                                                                                           'calculation': 0.71491575,
                                                                                           'capacity': 0.8027149,
                                                                                           'checkpoint': 0.10162155,
                                                                                           'chess': 0.26765594,
                                                                                           'class': 0.5377411,
                                                                                           'client': 0.028412435,
                                                                                           'clock': 0.81897706,
                                                                                           'cluster': 0.6336233,
                                                                                           'component': 1.2550238,
                                                                                           'components': 1.4753778,
                                                                                           'computation': 0.5360401,
                                                                                           'compute': 0.09496682,
                                                                                           'computer': 0.48583803,
                                                                                           'computers': 0.082595915,
                                                                                           'computing': 0.0053236387,
                                                                                           'concept': 0.09244595,
                                                                                           'concurrency': 0.080570355,
                                                                                           'configuration': 0.63552403,
                                                                                           'configured': 0.49945095,
                                                                                           'constant': 0.15874276,
                                                                                           'consumption': 0.3705247,
                                                                                           'count': 0.15291668,
                                                                                           'cpu': 0.4727478,
                                                                                           'data': 0.5534523,
                                                                                           'database': 0.24513115,
                                                                                           'definition': 0.25252765,
                                                                                           'dew': 0.027248075,
                                                                                           'disadvantage': 0.043538865,
                                                                                           'disk': 1.0258542,
                                                                                           'during': 0.024176076,
                                                                                           'e': 1.3067937,
                                                                                           'each': 0.01788934,
                                                                                           'ec': 0.5695534,
                                                                                           'ee': 0.08090695,
                                                                                           'effect': 0.33151782,
                                                                                           'employee': 0.14918438,
                                                                                           'employees': 0.026578736,
                                                                                           'equation': 0.42684066,
                                                                                           'es': 0.18498634,
                                                                                           'est': 0.098570675,
                                                                                           'estimate': 0.83097947,
                                                                                           'estimated': 0.19130428,
                                                                                           'estimates': 0.04933924,
                                                                                           'every': 0.384432,
                                                                                           'excess': 0.44124436,
                                                                                           'execute': 0.56965685,
                                                                                           'execution': 1.092663,
                                                                                           'exponential': 1.2772857,
                                                                                           'extra': 0.3341091,
                                                                                           'finish': 0.47172138,
                                                                                           'finished': 0.5516902,
                                                                                           'flow': 0.1065439,
                                                                                           'fra': 0.5131407,
                                                                                           'gage': 0.41627494,
                                                                                           'group': 0.40121686,
                                                                                           'handle': 0.76723486,
                                                                                           'handling': 0.8265911,
                                                                                           'hardware': 0.007931168,
                                                                                           'heap': 0.055197764,
                                                                                           'hours': 0.5783272,
                                                                                           'http': 0.16334121,
                                                                                           'implement': 0.20851848,
                                                                                           'improve': 0.033503063,
                                                                                           'index': 1.351592,
                                                                                           'indexed': 1.2516088,
                                                                                           'ing': 1.2539797,
                                                                                           'inventory': 0.26884475,
                                                                                           'io': 0.49151403,
                                                                                           'is': 0.67021686,
                                                                                           'items': 0.30828458,
                                                                                           'java': 1.233984,
                                                                                           'lake': 0.37700737,
                                                                                           'lane': 0.35798323,
                                                                                           'lang': 0.11334816,
                                                                                           'length': 0.39039937,
                                                                                           'library': 0.0020271246,
                                                                                           'load': 1.839116,
                                                                                           'loading': 0.52925104,
                                                                                           'log': 0.026120221,
                                                                                           'ma': 0.37466413,
                                                                                           'machine': 0.41295668,
                                                                                           'managed': 0.016499385,
                                                                                           'management': 0.24261811,
                                                                                           'many': 0.0001822544,
                                                                                           'map': 0.16712263,
                                                                                           'mat': 0.08338378,
                                                                                           'math': 0.69625205,
                                                                                           'maximum': 0.34880605,
                                                                                           'mb': 0.37918818,
                                                                                           'measure': 0.14309268,
                                                                                           'memory': 0.58699423,
                                                                                           'metric': 0.113157846,
                                                                                           'mill': 0.087879546,
                                                                                           'minimum': 0.042228475,
                                                                                           'mining': 0.31173173,
                                                                                           'minute': 0.2855463,
                                                                                           'minutes': 0.037687548,
                                                                                           'mm': 0.04705554,
                                                                                           'move': 0.24638273,
                                                                                           'moving': 1.068798,
                                                                                           'mp': 0.339956,
                                                                                           'mt': 0.18115476,
                                                                                           'multi': 0.045562405,
                                                                                           'multiple': 0.2256053,
                                                                                           'n': 0.20722932,
                                                                                           'network': 0.2870649,
                                                                                           'node': 0.74391615,
                                                                                           'nodes': 0.40956134,
                                                                                           'number': 0.5414315,
                                                                                           'object': 0.36274558,
                                                                                           'old': 0.026420968,
                                                                                           'older': 0.14505674,
                                                                                           'operation': 0.137978,
                                                                                           'optimal': 0.03703803,
                                                                                           'par': 0.0058114612,
                                                                                           'parts': 0.011510156,
                                                                                           'past': 0.25731233,
                                                                                           'percent': 0.35817072,
                                                                                           'performance': 0.801656,
                                                                                           'pool': 1.8708751,
                                                                                           'poole': 0.2727913,
                                                                                           'pools': 1.2964886,
                                                                                           'population': 0.11810607,
                                                                                           'predict': 0.18177378,
                                                                                           'probe': 0.21369988,
                                                                                           'processing': 0.4105097,
                                                                                           'proportional': 0.6098035,
                                                                                           'q': 0.13568267,
                                                                                           'queue': 1.2824515,
                                                                                           'rank': 0.40675223,
                                                                                           'rate': 0.46714726,
                                                                                           'request': 0.949167,
                                                                                           'requests': 0.6644938,
                                                                                           'requirements': 0.3288823,
                                                                                           'resource': 0.4609863,
                                                                                           'resources': 0.9455237,
                                                                                           'routing': 0.18650433,
                                                                                           'sample': 1.0472832,
                                                                                           'sampled': 0.8309003,
                                                                                           'samples': 1.1415888,
                                                                                           'sampling': 0.45636305,
                                                                                           'scala': 0.12271185,
                                                                                           'scale': 0.3144392,
                                                                                           'second': 0.49777645,
                                                                                           'seconds': 0.7695267,
                                                                                           'sequence': 0.21608938,
                                                                                           'serial': 0.049026124,
                                                                                           'server': 0.37191278,
                                                                                           'share': 0.19251333,
                                                                                           'si': 0.020900367,
                                                                                           'sid': 0.41317028,
                                                                                           'size': 0.7470095,
                                                                                           'sizes': 0.060290556,
                                                                                           'small': 0.015217632,
                                                                                           'speed': 0.21846266,
                                                                                           'sql': 0.39542097,
                                                                                           'stack': 0.047259662,
                                                                                           'start': 0.15702806,
                                                                                           'statistical': 0.031916108,
                                                                                           'statistics': 0.08593676,
                                                                                           'storage': 0.034532573,
                                                                                           'store': 0.053150244,
                                                                                           'survey': 0.1747176,
                                                                                           'system': 0.08567025,
                                                                                           'table': 0.006464522,
                                                                                           'task': 1.1504556,
                                                                                           'tasks': 0.7951614,
                                                                                           'taylor': 0.14394312,
                                                                                           'term': 0.63525033,
                                                                                           'thirty': 0.26077473,
                                                                                           'thread': 2.0543768,
                                                                                           'threads': 1.1089593,
                                                                                           'tier': 1.207179,
                                                                                           'time': 0.68932414,
                                                                                           'timer': 0.14907645,
                                                                                           'times': 0.32087305,
                                                                                           'total': 0.22359692,
                                                                                           'traffic': 0.26179498,
                                                                                           'trial': 0.2198535,
                                                                                           'u': 0.064360306,
                                                                                           'unit': 0.13278264,
                                                                                           'usage': 0.6241088,
                                                                                           'utilization': 1.6971744,
                                                                                           'value': 0.66488856,
                                                                                           'values': 0.2064584,
                                                                                           'w': 0.81893605,
                                                                                           'wait': 0.103130125,
                                                                                           'wall': 1.0635448,
                                                                                           'weaving': 0.07162173,
                                                                                           'web': 0.23646998,
                                                                                           'weight': 0.030211551,
                                                                                           'weighted': 1.2184887,
                                                                                           'work': 0.23164386,
                                                                                           'worker': 0.7420831,
                                                                                           'workers': 1.0619413,
                                                                                           'ze': 0.40276462},
                                                                            'text': 'load '
                                                                                    'components '
                                                                                    'The '
                                                                                    'thread '
                                                                                    'pool '
                                                                                    'utilization '
                                                                                    'is '
                                                                                    'an '
                                                                                    'exponentially '
                                                                                    'weighted '
                                                                                    'moving '
                                                                                    'average '
                                                                                    '(EWMA) '
                                                                                    'of '
                                                                                    'the '
                                                                                    'number '
                                                                                    'of '
                                                                                    'busy '
                                                                                    'threads '
                                                                                    'in '
                                                                                    'the '
                                                                                    'thread '
                                                                                    'pool, '
                                                                                    'sampled '
                                                                                    'every '
                                                                                    'second. '
                                                                                    'The '
                                                                                    'EWMA '
                                                                                    'of '
                                                                                    'the '
                                                                                    'sampled '
                                                                                    'thread '
                                                                                    'pool '
                                                                                    'utilization '
                                                                                    'values '
                                                                                    'is '
                                                                                    'configured '
                                                                                    'such '
                                                                                    'that '
                                                                                    'the '
                                                                                    'sampled '
                                                                                    'values '
                                                                                    'of '
                                                                                    'the '
                                                                                    'past '
                                                                                    '10 '
                                                                                    'seconds '
                                                                                    'have '
                                                                                    'the '
                                                                                    'most '
                                                                                    'effect '
                                                                                    'on '
                                                                                    'the '
                                                                                    'thread '
                                                                                    'pool '
                                                                                    'utilization '
                                                                                    'component '
                                                                                    'of '
                                                                                    'the '
                                                                                    'ingestion '
                                                                                    'load '
                                                                                    'and '
                                                                                    'samples '
                                                                                    'older '
                                                                                    'than '
                                                                                    '60 '
                                                                                    'seconds '
                                                                                    'have '
                                                                                    'very '
                                                                                    'negligible '
                                                                                    'impact. '
                                                                                    'To '
                                                                                    'estimate '
                                                                                    'the '
                                                                                    'resources '
                                                                                    'required '
                                                                                    'to '
                                                                                    'handle '
                                                                                    'the '
                                                                                    'queued '
                                                                                    'indexing '
                                                                                    'requests '
                                                                                    'in '
                                                                                    'the '
                                                                                    'thread '
                                                                                    'pool, '
                                                                                    'we '
                                                                                    'need '
                                                                                    'to '
                                                                                    'have '
                                                                                    'an '
                                                                                    'estimate '
                                                                                    'for '
                                                                                    'how '
                                                                                    'long '
                                                                                    'each '
                                                                                    'queued '
                                                                                    'task '
                                                                                    'can '
                                                                                    'take '
                                                                                    'to '
                                                                                    'execute. '
                                                                                    'To '
                                                                                    'achieve '
                                                                                    'this, '
                                                                                    'each '
                                                                                    'thread '
                                                                                    'pool '
                                                                                    'also '
                                                                                    'provides '
                                                                                    'an '
                                                                                    'EWMA '
                                                                                    'of '
                                                                                    'the '
                                                                                    'request '
                                                                                    'execution '
                                                                                    'time. '
                                                                                    'The '
                                                                                    'request '
                                                                                    'execution '
                                                                                    'time '
                                                                                    'for '
                                                                                    'an '
                                                                                    'indexing '
                                                                                    'request '
                                                                                    'is '
                                                                                    'the '
                                                                                    '(wall-clock) '
                                                                                    'time '
                                                                                    'taken '
                                                                                    'for '
                                                                                    'the '
                                                                                    'request '
                                                                                    'to '
                                                                                    'finish '
                                                                                    'once '
                                                                                    'it '
                                                                                    'is '
                                                                                    'out '
                                                                                    'of '
                                                                                    'the '
                                                                                    'queue '
                                                                                    'and '
                                                                                    'a '
                                                                                    'worker '
                                                                                    'thread '
                                                                                    'starts '
                                                                                    'executing '
                                                                                    'it. '
                                                                                    'As '
                                                                                    'some '
                                                                                    'queueing '
                                                                                    'is '
                                                                                    'acceptable '
                                                                                    'and '
                                                                                    'should '
                                                                                    'be '
                                                                                    'manageable '
                                                                                    'by '
                                                                                    'the '
                                                                                    'thread '
                                                                                    'pool, '
                                                                                    'we '
                                                                                    'try '
                                                                                    'to '
                                                                                    'estimate '
                                                                                    'the '
                                                                                    'resources '
                                                                                    'needed '
                                                                                    'to '
                                                                                    'handle '
                                                                                    'the '
                                                                                    'excess '
                                                                                    'queueing. '
                                                                                    'We '
                                                                                    'consider '
                                                                                    'up '
                                                                                    'to '
                                                                                    '30s '
                                                                                    'worth '
                                                                                    'of '
                                                                                    'tasks '
                                                                                    'in '
                                                                                    'the '
                                                                                    'queue '
                                                                                    'manageable '
                                                                                    'by '
                                                                                    'the '
                                                                                    'existing '
                                                                                    'number '
                                                                                    'of '
                                                                                    'workers '
                                                                                    'and '
                                                                                    'account '
                                                                                    'for '
                                                                                    'an '
                                                                                    'extra '
                                                                                    'thread '
                                                                                    'proportional '
                                                                                    'to '
                                                                                    'this '
                                                                                    'value. '
                                                                                    'For '
                                                                                    'example, '
                                                                                    'if '
                                                                                    'the '
                                                                                    'average '
                                                                                    'task '
                                                                                    'execution '
                                                                                    'time '
                                                                                    'is '
                                                                                    '200ms, '
                                                                                    'we '
                                                                                    'estimate '
                                                                                    'that '
                                                                                    'each '
                                                                                    'thread '
                                                                                    'is '
                                                                                    'able '
                                                                                    'to '
                                                                                    'handle '
                                                                                    '150 '
                                                                                    'indexing '
                                                                                    'requests '
                                                                                    'within '
                                                                                    '30s, '
                                                                                    'and '
                                                                                    'therefore '
                                                                                    'account '
                                                                                    'for '
                                                                                    'one '
                                                                                    'extra '
                                                                                    'thread '
                                                                                    'for '
                                                                                    'each '
                                                                                    '150 '
                                                                                    'queued '
                                                                                    'items. '
                                                                                     '\\small '
                                                                                     'queued\\_ingestion\\_load '
                                                                                     '= '
                                                                                     '\\frac{queue\\_size '
                                                                                     '\\times '
                                                                                     'average\\_request\\_execution\\_time}{30s} '
                                                                                    'Note '
                                                                                    'that '
                                                                                    'since '
                                                                                    'the '
                                                                                    'indexing '
                                                                                    'nodes '
                                                                                    'rely '
                                                                                    'on '
                                                                                    'pushing '
                                                                                    'indexed '
                                                                                    'data '
                                                                                    'into '
                                                                                    'the '
                                                                                    'object '
                                                                                    'store '
                                                                                    'periodically, '
                                                                                    'we '
                                                                                    'do '
                                                                                    'not '
                                                                                    'need '
                                                                                    'to '
                                                                                    'scale '
                                                                                    'the '
                                                                                    'indexing '
                                                                                    'tier '
                                                                                    'based '
                                                                                    'on '
                                                                                    'the '
                                                                                    'total '
                                                                                    'size '
                                                                                    'of '
                                                                                    'the '
                                                                                    'indexed '
                                                                                    'data. '
                                                                                    'However, '
                                                                                    'the '
                                                                                    'disk '
                                                                                    'IO '
                                                                                    'requirements '
                                                                                    'of '
                                                                                    'the '
                                                                                    'indexing '
                                                                                    'workload '
                                                                                     'need '
                                                                                    'to '
                                                                                    'be '
                                                                                    'considered '
                                                                                    'for '
                                                                                    'the '
                                                                                    'autoscaling '
                                                                                    'decisions. '
                                                                                    'The '
                                                                                    'ingestion '
                                                                                    'load '
                                                                                    'represents'},
                                                                           {'embeddings': {'##d': 0.38506436,
                                                                                           '##est': 0.8363302,
                                                                                           '##frame': 0.039107077,
                                                                                           '##ing': 1.0441189,
                                                                                           '##ion': 1.1721121,
                                                                                           '##ler': 1.0595164,
                                                                                           '##ling': 0.99718106,
                                                                                           '##load': 0.8622203,
                                                                                           '##s': 0.26257822,
                                                                                           '##sca': 1.4883617,
                                                                                           '(': 0.04112861,
                                                                                           '120': 0.10787471,
                                                                                           '150': 1.5649581,
                                                                                           '200': 0.78864884,
                                                                                           '30': 1.3745978,
                                                                                           '300': 0.21148267,
                                                                                           '50': 0.031711366,
                                                                                           '500': 0.8493792,
                                                                                           '_': 0.24777141,
                                                                                           'accounting': 0.64968836,
                                                                                           'additional': 0.3232339,
                                                                                           'algorithm': 1.0360106,
                                                                                           'algorithms': 0.20798434,
                                                                                           'analysis': 0.25909927,
                                                                                           'analyze': 0.18533573,
                                                                                           'apache': 0.8096589,
                                                                                           'api': 1.3224775,
                                                                                           'approximate': 0.0154337585,
                                                                                           'array': 0.23401959,
                                                                                           'auto': 1.4535567,
                                                                                           'automatic': 0.7868701,
                                                                                           'availability': 0.21982048,
                                                                                           'available': 0.030020691,
                                                                                           'average': 0.098859586,
                                                                                           'basic': 0.2743477,
                                                                                           'blocking': 0.10501332,
                                                                                           'bot': 0.07765888,
                                                                                           'buffer': 0.36042303,
                                                                                           'calculate': 0.21506485,
                                                                                           'calculation': 0.81758976,
                                                                                           'capacity': 0.58354694,
                                                                                           'cassandra': 0.22208737,
                                                                                           'checkpoint': 0.031537656,
                                                                                           'chess': 0.6237735,
                                                                                           'class': 0.439471,
                                                                                           'clock': 0.54654706,
                                                                                           'cluster': 1.4933486,
                                                                                           'cod': 0.12783043,
                                                                                           'computation': 0.39954206,
                                                                                           'compute': 0.042445127,
                                                                                           'computer': 0.13797997,
                                                                                           'constant': 0.2067099,
                                                                                           'cpu': 0.5182024,
                                                                                           'crawl': 0.22104222,
                                                                                           'data': 0.51176333,
                                                                                           'database': 0.440294,
                                                                                           'determined': 0.23795621,
                                                                                           'disk': 0.5893501,
                                                                                           'e': 0.05990428,
                                                                                           'each': 0.46478215,
                                                                                           'equation': 0.008288982,
                                                                                           'er': 0.43452957,
                                                                                           'es': 0.14311427,
                                                                                           'estimate': 0.25439763,
                                                                                           'every': 0.1305604,
                                                                                           'execution': 0.7186893,
                                                                                           'exposed': 0.23602542,
                                                                                           'extra': 0.7385199,
                                                                                           'fixed': 0.11877214,
                                                                                           'forum': 0.3137529,
                                                                                           'fra': 1.0726693,
                                                                                           'fragment': 0.030604606,
                                                                                           'g': 0.026902322,
                                                                                           'gage': 0.12548852,
                                                                                           'guild': 0.27722847,
                                                                                           'handle': 0.8976072,
                                                                                           'handling': 0.69513077,
                                                                                           'heap': 0.26846212,
                                                                                           'hours': 0.7121461,
                                                                                           'http': 0.10318518,
                                                                                           'index': 1.6740144,
                                                                                           'indexed': 1.1180266,
                                                                                           'indices': 0.88624585,
                                                                                           'ing': 1.10228,
                                                                                           'integer': 0.2208937,
                                                                                           'inventory': 0.44952998,
                                                                                           'io': 0.85926545,
                                                                                           'item': 0.48019466,
                                                                                           'items': 0.7935411,
                                                                                           'java': 1.237859,
                                                                                           'lane': 0.39564016,
                                                                                           'length': 0.47680393,
                                                                                           'limit': 0.4967848,
                                                                                           'load': 1.2765044,
                                                                                           'loading': 0.25379905,
                                                                                           'm': 0.06343312,
                                                                                           'machine': 0.19301167,
                                                                                           'maintenance': 0.23043938,
                                                                                           'map': 0.07359305,
                                                                                           'mass': 0.08436136,
                                                                                           'master': 1.1724675,
                                                                                           'matching': 0.044185776,
                                                                                           'math': 0.71257645,
                                                                                           'max': 0.16343911,
                                                                                           'maximum': 0.8216195,
                                                                                           'mb': 0.74474645,
                                                                                           'measure': 0.22327076,
                                                                                           'memory': 1.4785702,
                                                                                           'metadata': 0.8341058,
                                                                                           'metric': 0.9043063,
                                                                                           'minimal': 0.36312523,
                                                                                           'minimum': 1.0762551,
                                                                                           'mining': 0.6374103,
                                                                                           'mp': 0.18194582,
                                                                                           'multi': 0.19790418,
                                                                                           'multiple': 0.08082614,
                                                                                           'n': 0.2315838,
                                                                                           'network': 0.5508067,
                                                                                           'node': 1.3963627,
                                                                                           'nodes': 0.73737425,
                                                                                           'number': 0.082121976,
                                                                                           'o': 0.11493757,
                                                                                           'object': 0.5812754,
                                                                                           'par': 0.023205614,
                                                                                           'per': 0.23101303,
                                                                                           'performance': 0.23446344,
                                                                                           'pool': 0.8049336,
                                                                                           'pools': 0.15594147,
                                                                                           'predict': 0.024841096,
                                                                                           'processing': 0.36487442,
                                                                                           'pushing': 0.20726342,
                                                                                           'q': 0.8291657,
                                                                                           'quarterly': 0.13623458,
                                                                                           'queue': 1.481917,
                                                                                           'rail': 0.078313634,
                                                                                           'ram': 0.28152135,
                                                                                           'rank': 0.3435108,
                                                                                           'ratio': 0.06241234,
                                                                                           're': 0.2784615,
                                                                                           'regional': 0.34884617,
                                                                                           'request': 0.99899644,
                                                                                           'requests': 0.99197084,
                                                                                           'requirement': 0.62241584,
                                                                                           'requirements': 0.674187,
                                                                                           'resolution': 0.02591185,
                                                                                           'routing': 0.19566713,
                                                                                           'scala': 0.17918167,
                                                                                           'scale': 0.15746343,
                                                                                           'seconds': 0.13917202,
                                                                                           'semi': 0.23686175,
                                                                                           'sequence': 0.5461212,
                                                                                           'ser': 0.08773902,
                                                                                           'serial': 0.29184434,
                                                                                           'server': 0.5091232,
                                                                                           'shards': 1.1462573,
                                                                                           'sid': 0.5460215,
                                                                                           'size': 0.5671189,
                                                                                           'small': 0.1666983,
                                                                                           'sort': 0.20719269,
                                                                                           'sql': 0.21473138,
                                                                                           'stack': 0.042597417,
                                                                                           'statistics': 0.019139726,
                                                                                           'storage': 0.11576759,
                                                                                           'strategy': 0.06358851,
                                                                                           'swarm': 0.08892168,
                                                                                           't': 0.15734711,
                                                                                           'task': 0.2625412,
                                                                                           'taylor': 0.059171513,
                                                                                           'thirty': 0.59235644,
                                                                                           'thread': 1.7254765,
                                                                                           'threads': 1.1326298,
                                                                                           'tier': 2.0103586,
                                                                                           'time': 0.5197543,
                                                                                           'times': 0.19328791,
                                                                                           'total': 0.9341554,
                                                                                           'trial': 1.0915743,
                                                                                           'ur': 0.041876547,
                                                                                           'value': 0.39162463,
                                                                                           'values': 0.10083909,
                                                                                           'wall': 0.93653333,
                                                                                           'web': 0.1397472,
                                                                                           'weeks': 0.027450949,
                                                                                           'within': 0.38789856,
                                                                                           'work': 0.1474287,
                                                                                           'workers': 0.30503651,
                                                                                           'write': 0.33134767,
                                                                                           'x': 0.027046092,
                                                                                           'z': 0.06591661,
                                                                                           'ze': 0.69916034},
                                                                            'text': 'each '
                                                                                    'thread '
                                                                                    'is '
                                                                                    'able '
                                                                                    'to '
                                                                                    'handle '
                                                                                    '150 '
                                                                                    'indexing '
                                                                                    'requests '
                                                                                    'within '
                                                                                    '30s, '
                                                                                    'and '
                                                                                    'therefore '
                                                                                    'account '
                                                                                    'for '
                                                                                    'one '
                                                                                    'extra '
                                                                                    'thread '
                                                                                    'for '
                                                                                    'each '
                                                                                    '150 '
                                                                                    'queued '
                                                                                    'items. '
                                                                                     '\\small '
                                                                                     'queued\\_ingestion\\_load '
                                                                                     '= '
                                                                                     '\\frac{queue\\_size '
                                                                                     '\\times '
                                                                                     'average\\_request\\_execution\\_time}{30s} '
                                                                                    'Note '
                                                                                    'that '
                                                                                    'since '
                                                                                    'the '
                                                                                    'indexing '
                                                                                    'nodes '
                                                                                    'rely '
                                                                                    'on '
                                                                                    'pushing '
                                                                                    'indexed '
                                                                                    'data '
                                                                                    'into '
                                                                                    'the '
                                                                                    'object '
                                                                                    'store '
                                                                                    'periodically, '
                                                                                    'we '
                                                                                    'do '
                                                                                    'not '
                                                                                    'need '
                                                                                    'to '
                                                                                    'scale '
                                                                                    'the '
                                                                                    'indexing '
                                                                                    'tier '
                                                                                    'based '
                                                                                    'on '
                                                                                    'the '
                                                                                    'total '
                                                                                    'size '
                                                                                    'of '
                                                                                    'the '
                                                                                    'indexed '
                                                                                    'data. '
                                                                                    'However, '
                                                                                    'the '
                                                                                    'disk '
                                                                                    'IO '
                                                                                    'requirements '
                                                                                    'of '
                                                                                    'the '
                                                                                    'indexing '
                                                                                    'workload '
                                                                                     'need '
                                                                                    'to '
                                                                                    'be '
                                                                                    'considered '
                                                                                    'for '
                                                                                    'the '
                                                                                    'autoscaling '
                                                                                    'decisions. '
                                                                                    'The '
                                                                                    'ingestion '
                                                                                    'load '
                                                                                    'represents '
                                                                                    'both '
                                                                                    'CPU '
                                                                                    'requirements '
                                                                                    'of '
                                                                                    'the '
                                                                                    'indexing '
                                                                                    'nodes '
                                                                                    'as '
                                                                                    'well '
                                                                                    'as '
                                                                                    'disk '
                                                                                    'IO '
                                                                                    'since '
                                                                                    'both '
                                                                                    'CPU '
                                                                                    'and '
                                                                                    'IO '
                                                                                    'work '
                                                                                    'is '
                                                                                    'done '
                                                                                    'by '
                                                                                    'the '
                                                                                    'write '
                                                                                    'thread '
                                                                                    'pool '
                                                                                    'workers '
                                                                                    'and '
                                                                                    'we '
                                                                                    'rely '
                                                                                    'on '
                                                                                    'the '
                                                                                    'wall '
                                                                                    'clock '
                                                                                    'time '
                                                                                    'to '
                                                                                    'estimate '
                                                                                    'the '
                                                                                    'required '
                                                                                    'time '
                                                                                    'to '
                                                                                    'handle '
                                                                                    'the '
                                                                                    'queued '
                                                                                    'requests. '
                                                                                    'Each '
                                                                                    'indexing '
                                                                                    'node '
                                                                                    'calculates '
                                                                                    'its '
                                                                                    'ingestion '
                                                                                    'load '
                                                                                    'and '
                                                                                    'publishes '
                                                                                    'this '
                                                                                    'value '
                                                                                    'to '
                                                                                    'the '
                                                                                    'master '
                                                                                    'node '
                                                                                    'periodically. '
                                                                                    'The '
                                                                                    'master '
                                                                                    'node '
                                                                                    'serves '
                                                                                    'the '
                                                                                    'per '
                                                                                    'node '
                                                                                    'ingestion '
                                                                                    'load '
                                                                                    'values '
                                                                                    'via '
                                                                                    'the '
                                                                                    'autoscaling '
                                                                                    'metrics '
                                                                                    'API '
                                                                                    'to '
                                                                                    'the '
                                                                                    'autoscaler. '
                                                                                    'Memory '
                                                                                    'The '
                                                                                    'memory '
                                                                                    'metrics '
                                                                                    'exposed '
                                                                                    'by '
                                                                                    'the '
                                                                                    'autoscaling '
                                                                                    'metrics '
                                                                                    'API '
                                                                                    'are '
                                                                                    'node '
                                                                                    'memory '
                                                                                    'and '
                                                                                    'tier '
                                                                                    'memory. '
                                                                                    'The '
                                                                                    'node '
                                                                                    'memory '
                                                                                    'represents '
                                                                                    'the '
                                                                                    'minimum '
                                                                                    'memory '
                                                                                    'requirement '
                                                                                    'for '
                                                                                    'each '
                                                                                    'indexing '
                                                                                    'node '
                                                                                    'in '
                                                                                    'the '
                                                                                    'cluster. '
                                                                                    'The '
                                                                                    'tier '
                                                                                    'memory '
                                                                                    'metric '
                                                                                    'represents '
                                                                                    'the '
                                                                                    'minimum '
                                                                                    'total '
                                                                                    'memory '
                                                                                    'that '
                                                                                    'should '
                                                                                    'be '
                                                                                    'available '
                                                                                    'in '
                                                                                    'the '
                                                                                    'indexing '
                                                                                    'tier. '
                                                                                    'Note '
                                                                                    'that '
                                                                                    'these '
                                                                                    'values '
                                                                                    'only '
                                                                                    'indicate '
                                                                                    'the '
                                                                                    'minimum '
                                                                                    'to '
                                                                                    'ensure '
                                                                                    'that '
                                                                                    'each '
                                                                                    'node '
                                                                                    'is '
                                                                                    'able '
                                                                                    'to '
                                                                                    'handle '
                                                                                    'the '
                                                                                    'basic '
                                                                                    'indexing '
                                                                                    'workload '
                                                                                    'and '
                                                                                    'hold '
                                                                                    'the '
                                                                                    'cluster '
                                                                                    'and '
                                                                                    'indices '
                                                                                    'metadata, '
                                                                                    'while '
                                                                                    'ensuring '
                                                                                    'that '
                                                                                    'the '
                                                                                    'tier '
                                                                                    'includes '
                                                                                    'enough '
                                                                                    'nodes '
                                                                                    'to '
                                                                                    'accommodate '
                                                                                    'all '
                                                                                    'index '
                                                                                    'shards. '
                                                                                    'Node '
                                                                                    'memory '
                                                                                    'must '
                                                                                    'have '
                                                                                    'a '
                                                                                    'minimum '
                                                                                    'of '
                                                                                    '500MB '
                                                                                    'to '
                                                                                    'be '
                                                                                    'able '
                                                                                    'to '
                                                                                    'handle '
                                                                                    'indexing '
                                                                                    'workloads '
                                                                                    ', '
                                                                                    'as '
                                                                                    'well '
                                                                                    'as '
                                                                                    'a '
                                                                                    'fixed '
                                                                                    'amount '
                                                                                    'of '
                                                                                    'memory '
                                                                                    'per '
                                                                                    'each '
                                                                                    'index '
                                                                                    '. '
                                                                                    'This '
                                                                                    'ensures '
                                                                                    'all '
                                                                                    'nodes '
                                                                                    'can '
                                                                                    'hold '
                                                                                    'metadata '
                                                                                    'for '
                                                                                    'the '
                                                                                    'cluster, '
                                                                                    'which '
                                                                                    'includes '
                                                                                    'metadata '
                                                                                    'for '
                                                                                    'every '
                                                                                    'index. '
                                                                                    'Tier '
                                                                                    'memory '
                                                                                    'is '
                                                                                    'determined '
                                                                                    'by '
                                                                                    'accounting '
                                                                                    'for '
                                                                                    'the '
                                                                                    'memory'},
                                                                           {'embeddings': {'##d': 0.055720266,
                                                                                           '##est': 0.87620574,
                                                                                           '##ging': 0.12167851,
                                                                                           '##id': 0.007303444,
                                                                                           '##ing': 1.0664626,
                                                                                           '##ion': 0.5800176,
                                                                                           '##ler': 1.1925261,
                                                                                           '##ling': 1.0163201,
                                                                                           '##load': 0.81047934,
                                                                                           '##mb': 0.41285288,
                                                                                           '##rch': 0.9021695,
                                                                                           '##rd': 1.5396098,
                                                                                           '##rds': 0.47700712,
                                                                                           '##s': 0.033316635,
                                                                                           '##sca': 1.5766962,
                                                                                           '##sea': 1.0991455,
                                                                                           '500': 0.8151243,
                                                                                           '6': 0.5519658,
                                                                                           'accounting': 0.74103206,
                                                                                           'algorithm': 1.0231093,
                                                                                           'algorithms': 0.065428115,
                                                                                           'allocated': 0.19617477,
                                                                                           'amazon': 0.31502825,
                                                                                           'analysis': 0.5597703,
                                                                                           'analyze': 0.30770445,
                                                                                           'apache': 0.8908353,
                                                                                           'api': 1.1461797,
                                                                                           'approximate': 0.21645284,
                                                                                           'archive': 0.013153568,
                                                                                           'array': 0.047213156,
                                                                                           'auto': 1.3802772,
                                                                                           'automatic': 0.7499421,
                                                                                           'availability': 0.10610637,
                                                                                           'basic': 0.5700848,
                                                                                           'blocking': 0.03154505,
                                                                                           'bot': 0.2956401,
                                                                                           'brain': 0.13824557,
                                                                                           'brick': 0.34880513,
                                                                                           'broken': 0.1587869,
                                                                                           'buffer': 0.27810082,
                                                                                           'bug': 0.019329984,
                                                                                           'cad': 0.010832788,
                                                                                           'calculate': 0.71264565,
                                                                                           'calculated': 0.19991197,
                                                                                           'calculation': 0.90854484,
                                                                                           'capacity': 0.13310817,
                                                                                           'cassandra': 0.269642,
                                                                                           'checkpoint': 0.33004454,
                                                                                           'chess': 0.6517597,
                                                                                           'class': 0.40205157,
                                                                                           'clock': 1.2123855,
                                                                                           'cluster': 1.5899432,
                                                                                           'clusters': 0.21755162,
                                                                                           'computation': 0.3360238,
                                                                                           'compute': 0.15521479,
                                                                                           'computer': 0.4586727,
                                                                                           'computers': 0.09730453,
                                                                                           'core': 0.18051882,
                                                                                           'cores': 0.54003507,
                                                                                           'cpu': 1.4255431,
                                                                                           'data': 0.7048903,
                                                                                           'database': 0.5640705,
                                                                                           'depend': 0.08640857,
                                                                                           'deploy': 0.116062716,
                                                                                           'deployed': 0.16281521,
                                                                                           'deployment': 1.375697,
                                                                                           'dev': 0.16744493,
                                                                                           'disk': 1.2671278,
                                                                                           'display': 0.10427013,
                                                                                           'done': 0.057584852,
                                                                                           'each': 0.44890955,
                                                                                           'elastic': 1.3546548,
                                                                                           'estimate': 1.1541563,
                                                                                           'estimated': 0.4820726,
                                                                                           'estimates': 0.68956727,
                                                                                           'execution': 0.025004579,
                                                                                           'expose': 0.3791655,
                                                                                           'exposed': 1.4152902,
                                                                                           'exposing': 0.2018034,
                                                                                           'exposure': 0.22712028,
                                                                                           'field': 0.43335024,
                                                                                           'fixed': 0.3727484,
                                                                                           'fragment': 0.3541149,
                                                                                           'fragments': 0.19871251,
                                                                                           'framework': 0.0067325183,
                                                                                           'gage': 0.062432837,
                                                                                           'gb': 0.23573099,
                                                                                           'guild': 0.06864197,
                                                                                           'handle': 0.6664566,
                                                                                           'handling': 0.79544353,
                                                                                           'hardware': 0.15463935,
                                                                                           'hash': 0.056183893,
                                                                                           'host': 0.49334934,
                                                                                           'hours': 0.23847345,
                                                                                           'hu': 0.12027907,
                                                                                           'index': 1.84248,
                                                                                           'indexed': 0.5543888,
                                                                                           'indices': 0.8364849,
                                                                                           'ing': 1.1731079,
                                                                                           'integration': 0.43307945,
                                                                                           'interface': 0.13424914,
                                                                                           'inventory': 0.43660846,
                                                                                           'io': 1.1710184,
                                                                                           'java': 1.1948129,
                                                                                           'kb': 0.275635,
                                                                                           'lane': 0.065143116,
                                                                                           'lang': 0.07760714,
                                                                                           'length': 0.19545008,
                                                                                           'limit': 0.14939034,
                                                                                           'load': 1.068046,
                                                                                           'loading': 0.3452746,
                                                                                           'machine': 0.28579098,
                                                                                           'maintenance': 0.24792214,
                                                                                           'management': 0.016834572,
                                                                                           'mandatory': 0.09757359,
                                                                                           'map': 0.33999705,
                                                                                           'mapped': 0.4253768,
                                                                                           'mapping': 0.7739739,
                                                                                           'master': 1.514614,
                                                                                           'math': 0.62235314,
                                                                                           'maximum': 0.4592383,
                                                                                           'mb': 0.8386821,
                                                                                           'measure': 0.35868418,
                                                                                           'memory': 1.4037786,
                                                                                           'metadata': 0.57345796,
                                                                                           'metric': 1.0478114,
                                                                                           'minimal': 0.55310273,
                                                                                           'minimum': 1.1779544,
                                                                                           'mining': 0.60987383,
                                                                                           'monitor': 0.41601682,
                                                                                           'monitoring': 0.80379987,
                                                                                           'multiple': 0.0046412363,
                                                                                           'need': 0.13691676,
                                                                                           'needs': 0.09020152,
                                                                                           'network': 0.5226748,
                                                                                           'node': 1.5207812,
                                                                                           'nodes': 0.9873411,
                                                                                           'number': 0.08917359,
                                                                                           'o': 0.47437057,
                                                                                           'open': 0.9998891,
                                                                                           'operation': 0.059715636,
                                                                                           'parameters': 0.06929999,
                                                                                           'per': 1.2698478,
                                                                                           'performance': 0.27903107,
                                                                                           'pool': 1.1343037,
                                                                                           'pools': 0.5005684,
                                                                                           'predict': 0.15172759,
                                                                                           'processing': 0.34928247,
                                                                                           'processor': 0.06942589,
                                                                                           'provided': 0.33421612,
                                                                                           'published': 0.35502988,
                                                                                           'queue': 1.4328028,
                                                                                           'ram': 0.07832895,
                                                                                           'rank': 0.09849679,
                                                                                           'regional': 0.023943441,
                                                                                           'request': 0.58130133,
                                                                                           'requests': 0.4985438,
                                                                                           'require': 0.054292977,
                                                                                           'required': 0.20457663,
                                                                                           'requirement': 0.9255918,
                                                                                           'requirements': 1.1021699,
                                                                                           'resolution': 0.2503146,
                                                                                           'resource': 0.22062841,
                                                                                           'resources': 0.7977981,
                                                                                           'scala': 0.046379413,
                                                                                           'scale': 0.34393448,
                                                                                           'scaling': 0.5871495,
                                                                                           'script': 0.07091305,
                                                                                           'search': 0.2748066,
                                                                                           'semi': 0.19345926,
                                                                                           'sequence': 0.2634719,
                                                                                           'serial': 0.281783,
                                                                                           'serve': 0.3122354,
                                                                                           'server': 0.62030464,
                                                                                           'sha': 1.412181,
                                                                                           'shards': 1.2690446,
                                                                                           'sid': 0.5395205,
                                                                                           'size': 0.37528938,
                                                                                           'software': 0.2301807,
                                                                                           'sql': 0.28173122,
                                                                                           'storage': 0.17134488,
                                                                                           'sum': 0.48667532,
                                                                                           'swarm': 0.09873215,
                                                                                           'task': 0.15503421,
                                                                                           'thread': 1.2720325,
                                                                                           'threads': 0.5098314,
                                                                                           'tier': 2.0405457,
                                                                                           'time': 0.691699,
                                                                                           'timer': 0.3272765,
                                                                                           'total': 0.853305,
                                                                                           'trial': 0.75489986,
                                                                                           'value': 0.55824566,
                                                                                           'values': 0.18979663,
                                                                                           'wall': 1.5562296,
                                                                                           'walls': 0.57668746,
                                                                                           'web': 0.12833436,
                                                                                           'workers': 0.30275372,
                                                                                           'write': 0.8986184},
                                                                            'text': 'both '
                                                                                    'CPU '
                                                                                    'requirements '
                                                                                    'of '
                                                                                    'the '
                                                                                    'indexing '
                                                                                    'nodes '
                                                                                    'as '
                                                                                    'well '
                                                                                    'as '
                                                                                    'disk '
                                                                                    'IO '
                                                                                    'since '
                                                                                    'both '
                                                                                    'CPU '
                                                                                    'and '
                                                                                    'IO '
                                                                                    'work '
                                                                                    'is '
                                                                                    'done '
                                                                                    'by '
                                                                                    'the '
                                                                                    'write '
                                                                                    'thread '
                                                                                    'pool '
                                                                                    'workers '
                                                                                    'and '
                                                                                    'we '
                                                                                    'rely '
                                                                                    'on '
                                                                                    'the '
                                                                                    'wall '
                                                                                    'clock '
                                                                                    'time '
                                                                                    'to '
                                                                                    'estimate '
                                                                                    'the '
                                                                                    'required '
                                                                                    'time '
                                                                                    'to '
                                                                                    'handle '
                                                                                    'the '
                                                                                    'queued '
                                                                                    'requests. '
                                                                                    'Each '
                                                                                    'indexing '
                                                                                    'node '
                                                                                    'calculates '
                                                                                    'its '
                                                                                    'ingestion '
                                                                                    'load '
                                                                                    'and '
                                                                                    'publishes '
                                                                                    'this '
                                                                                    'value '
                                                                                    'to '
                                                                                    'the '
                                                                                    'master '
                                                                                    'node '
                                                                                    'periodically. '
                                                                                    'The '
                                                                                    'master '
                                                                                    'node '
                                                                                    'serves '
                                                                                    'the '
                                                                                    'per '
                                                                                    'node '
                                                                                    'ingestion '
                                                                                    'load '
                                                                                    'values '
                                                                                    'via '
                                                                                    'the '
                                                                                    'autoscaling '
                                                                                    'metrics '
                                                                                    'API '
                                                                                    'to '
                                                                                    'the '
                                                                                    'autoscaler. '
                                                                                    'Memory '
                                                                                    'The '
                                                                                    'memory '
                                                                                    'metrics '
                                                                                    'exposed '
                                                                                    'by '
                                                                                    'the '
                                                                                    'autoscaling '
                                                                                    'metrics '
                                                                                    'API '
                                                                                    'are '
                                                                                    'node '
                                                                                    'memory '
                                                                                    'and '
                                                                                    'tier '
                                                                                    'memory. '
                                                                                    'The '
                                                                                    'node '
                                                                                    'memory '
                                                                                    'represents '
                                                                                    'the '
                                                                                    'minimum '
                                                                                    'memory '
                                                                                    'requirement '
                                                                                    'for '
                                                                                    'each '
                                                                                    'indexing '
                                                                                    'node '
                                                                                    'in '
                                                                                    'the '
                                                                                    'cluster. '
                                                                                    'The '
                                                                                    'tier '
                                                                                    'memory '
                                                                                    'metric '
                                                                                    'represents '
                                                                                    'the '
                                                                                    'minimum '
                                                                                    'total '
                                                                                    'memory '
                                                                                    'that '
                                                                                    'should '
                                                                                    'be '
                                                                                    'available '
                                                                                    'in '
                                                                                    'the '
                                                                                    'indexing '
                                                                                    'tier. '
                                                                                    'Note '
                                                                                    'that '
                                                                                    'these '
                                                                                    'values '
                                                                                    'only '
                                                                                    'indicate '
                                                                                    'the '
                                                                                    'minimum '
                                                                                    'to '
                                                                                    'ensure '
                                                                                    'that '
                                                                                    'each '
                                                                                    'node '
                                                                                    'is '
                                                                                    'able '
                                                                                    'to '
                                                                                    'handle '
                                                                                    'the '
                                                                                    'basic '
                                                                                    'indexing '
                                                                                    'workload '
                                                                                    'and '
                                                                                    'hold '
                                                                                    'the '
                                                                                    'cluster '
                                                                                    'and '
                                                                                    'indices '
                                                                                    'metadata, '
                                                                                    'while '
                                                                                    'ensuring '
                                                                                    'that '
                                                                                    'the '
                                                                                    'tier '
                                                                                    'includes '
                                                                                    'enough '
                                                                                    'nodes '
                                                                                    'to '
                                                                                    'accommodate '
                                                                                    'all '
                                                                                    'index '
                                                                                    'shards. '
                                                                                    'Node '
                                                                                    'memory '
                                                                                    'must '
                                                                                    'have '
                                                                                    'a '
                                                                                    'minimum '
                                                                                    'of '
                                                                                    '500MB '
                                                                                    'to '
                                                                                    'be '
                                                                                    'able '
                                                                                    'to '
                                                                                    'handle '
                                                                                    'indexing '
                                                                                    'workloads '
                                                                                    ', '
                                                                                    'as '
                                                                                    'well '
                                                                                    'as '
                                                                                    'a '
                                                                                    'fixed '
                                                                                    'amount '
                                                                                    'of '
                                                                                    'memory '
                                                                                    'per '
                                                                                    'each '
                                                                                    'index '
                                                                                    '. '
                                                                                    'This '
                                                                                    'ensures '
                                                                                    'all '
                                                                                    'nodes '
                                                                                    'can '
                                                                                    'hold '
                                                                                    'metadata '
                                                                                    'for '
                                                                                    'the '
                                                                                    'cluster, '
                                                                                    'which '
                                                                                    'includes '
                                                                                    'metadata '
                                                                                    'for '
                                                                                    'every '
                                                                                    'index. '
                                                                                    'Tier '
                                                                                    'memory '
                                                                                    'is '
                                                                                    'determined '
                                                                                    'by '
                                                                                    'accounting '
                                                                                    'for '
                                                                                    'the '
                                                                                    'memory '
                                                                                    'overhead '
                                                                                    'of '
                                                                                    'the '
                                                                                    'field '
                                                                                    'mappings '
                                                                                    'of '
                                                                                    'the '
                                                                                    'indices '
                                                                                    'and '
                                                                                    'the '
                                                                                    'amount '
                                                                                    'of '
                                                                                    'memory '
                                                                                    'needed '
                                                                                    'for '
                                                                                    'each '
                                                                                    'open '
                                                                                    'shard '
                                                                                    'allocated '
                                                                                    'on '
                                                                                    'a '
                                                                                    'node '
                                                                                    'in '
                                                                                    'the '
                                                                                    'cluster. '
                                                                                    'Currently, '
                                                                                    'the '
                                                                                    'per-shard '
                                                                                    'memory '
                                                                                    'requirement '
                                                                                    'uses '
                                                                                    'a '
                                                                                    'fixed '
                                                                                    'estimate '
                                                                                    'of '
                                                                                    '6MB. '
                                                                                    'We '
                                                                                    'plan '
                                                                                    'to '
                                                                                    'refine '
                                                                                    'this '
                                                                                    'value. '
                                                                                    'The '
                                                                                    'estimate '
                                                                                    'for '
                                                                                    'the '
                                                                                    'memory '
                                                                                    'requirements '
                                                                                    'for '
                                                                                    'the '
                                                                                    'mappings '
                                                                                    'of '
                                                                                    'each '
                                                                                    'index '
                                                                                    'is '
                                                                                    'calculated '
                                                                                    'by '
                                                                                    'one '
                                                                                    'of '
                                                                                    'the '
                                                                                    'data '
                                                                                    'nodes '
                                                                                    'that '
                                                                                    'hosts '
                                                                                    'a '
                                                                                    'shard '
                                                                                    'of '
                                                                                    'the '
                                                                                    'index. '
                                                                                    'The '
                                                                                    'calculated '
                                                                                    'estimates '
                                                                                    'are '
                                                                                    'sent '
                                                                                    'to '
                                                                                    'the '
                                                                                    'master '
                                                                                    'node. '
                                                                                    'Whenever '
                                                                                    'there '
                                                                                    'is '
                                                                                    'a '
                                                                                    'mapping '
                                                                                    'change '
                                                                                    'this '
                                                                                    'estimate '
                                                                                    'is '
                                                                                    'updated '
                                                                                    'and '
                                                                                    'published '
                                                                                    'to '
                                                                                    'the '
                                                                                    'master '
                                                                                    'node '
                                                                                    'again. '
                                                                                    'The '
                                                                                    'master '
                                                                                    'node '
                                                                                    'serves '
                                                                                    'the '
                                                                                    'node '
                                                                                    'and '
                                                                                    'total '
                                                                                    'memory '
                                                                                    'metrics '
                                                                                    'based '
                                                                                    'on '
                                                                                    'these '
                                                                                    'information '
                                                                                    'via '
                                                                                    'the '
                                                                                    'autoscaling '
                                                                                    'metrics '
                                                                                    'API '
                                                                                    'to '
                                                                                    'the '
                                                                                    'autoscaler. '
                                                                                    'Scaling '
                                                                                    'the '
                                                                                    'cluster '
                                                                                    'The '
                                                                                    'autoscaler '
                                                                                    'is '
                                                                                    'responsible '
                                                                                    'for '
                                                                                    'monitoring '
                                                                                    'the '
                                                                                    'Elasticsearch '
                                                                                    'cluster '
                                                                                    'via '
                                                                                    'the '
                                                                                    'exposed '
                                                                                    'metrics, '
                                                                                    'calculating '
                                                                                    'the '
                                                                                    'desirable '
                                                                                    'cluster '
                                                                                    'size '
                                                                                    'to '
                                                                                    'adapt '
                                                                                    'to '
                                                                                    'the '
                                                                                    'indexing '
                                                                                    'workload, '
                                                                                    'and '
                                                                                    'updating '
                                                                                    'the '
                                                                                    'deployment '
                                                                                    'accordingly. '
                                                                                    'This '
                                                                                    'is '
                                                                                    'done '
                                                                                    'by '
                                                                                    'calculating '
                                                                                    'the '
                                                                                    'total '
                                                                                    'required '
                                                                                    'CPU '
                                                                                    'and '
                                                                                    'memory '
                                                                                    'resources '
                                                                                    'based '
                                                                                    'on '
                                                                                    'the '
                                                                                    'ingestion '
                                                                                    'load '
                                                                                    'and '
                                                                                    'memory '
                                                                                    'metrics. '
                                                                                    'The '
                                                                                    'sum '
                                                                                    'of '
                                                                                    'all '
                                                                                    'the '
                                                                                    'ingestion '
                                                                                    'load '
                                                                                    'per '
                                                                                    'node '
                                                                                    'values '
                                                                                    'determines '
                                                                                    'the '
                                                                                    'total '
                                                                                    'number '
                                                                                    'of '
                                                                                    'CPU '
                                                                                    'cores '
                                                                                    'needed '
                                                                                    'for '
                                                                                    'the '
                                                                                    'indexing '
                                                                                    'tier. '
                                                                                    'The '
                                                                                    'calculated '
                                                                                    'CPU '
                                                                                    'requirement '
                                                                                    'and '
                                                                                    'the '
                                                                                    'provided '
                                                                                    'minimum '
                                                                                    'node '
                                                                                    'and '
                                                                                    'tier '
                                                                                    'memory '
                                                                                    'resources '
                                                                                    'are '
                                                                                    'mapped '
                                                                                    'to '
                                                                                    'a '
                                                                                    'predetermined '
                                                                                    'set'},
                                                                           {'embeddings': {'##ber': 0.9460652,
                                                                                           '##d': 0.10023495,
                                                                                           '##es': 0.14341043,
                                                                                           '##gb': 0.6906553,
                                                                                           '##ine': 0.9458122,
                                                                                           '##ing': 0.42145026,
                                                                                           '##ler': 1.2356958,
                                                                                           '##ling': 0.63835293,
                                                                                           '##load': 0.2904571,
                                                                                           '##mb': 0.6970242,
                                                                                           '##net': 0.7010928,
                                                                                           '##pu': 1.0257086,
                                                                                           '##rch': 1.0700952,
                                                                                           '##rd': 1.6493205,
                                                                                           '##rds': 0.6754141,
                                                                                           '##rt': 0.12942569,
                                                                                           '##sca': 1.4853197,
                                                                                           '##sea': 1.4192088,
                                                                                           '##vc': 1.405061,
                                                                                           '100': 0.26849923,
                                                                                           '16': 0.19268984,
                                                                                           '160': 0.2302431,
                                                                                           '1600': 0.8732733,
                                                                                           '32': 1.2120824,
                                                                                           '6': 0.70548016,
                                                                                           '64': 1.202607,
                                                                                           'algorithm': 0.937971,
                                                                                           'allocated': 0.73692024,
                                                                                           'allocation': 0.4625666,
                                                                                           'amazon': 0.86137766,
                                                                                           'analysis': 0.58160084,
                                                                                           'analyze': 0.023657316,
                                                                                           'apache': 0.85805637,
                                                                                           'api': 0.9369967,
                                                                                           'approximate': 0.15172462,
                                                                                           'auto': 1.225151,
                                                                                           'automatic': 0.7224918,
                                                                                           'availability': 0.3053787,
                                                                                           'bot': 0.33649588,
                                                                                           'brick': 0.28021842,
                                                                                           'buffer': 0.27807808,
                                                                                           'bug': 0.12689802,
                                                                                           'calculate': 0.56475216,
                                                                                           'calculated': 0.2805605,
                                                                                           'calculating': 0.18157567,
                                                                                           'calculation': 1.0562031,
                                                                                           'capacity': 0.19689727,
                                                                                           'certification': 0.030283952,
                                                                                           'checkpoint': 0.1251825,
                                                                                           'chess': 0.38721076,
                                                                                           'class': 0.044428803,
                                                                                           'closed': 0.20298174,
                                                                                           'cluster': 1.8217679,
                                                                                           'clusters': 0.40412048,
                                                                                           'computation': 0.27228907,
                                                                                           'compute': 0.157462,
                                                                                           'computer': 0.07424284,
                                                                                           'cores': 0.28018573,
                                                                                           'cpu': 0.874331,
                                                                                           'criteria': 0.20424062,
                                                                                           'cube': 0.078070216,
                                                                                           'currently': 0.26391146,
                                                                                           'data': 0.57366157,
                                                                                           'database': 0.5346718,
                                                                                           'deploy': 0.31853938,
                                                                                           'deployed': 0.23235346,
                                                                                           'deployment': 1.38996,
                                                                                           'desirable': 0.25084683,
                                                                                           'desired': 0.05757945,
                                                                                           'determine': 0.07967118,
                                                                                           'determined': 0.38774973,
                                                                                           'dimensions': 0.3834306,
                                                                                           'disk': 0.7686433,
                                                                                           'display': 0.044948753,
                                                                                           'domain': 0.05484484,
                                                                                           'each': 0.026949435,
                                                                                           'elastic': 1.7217911,
                                                                                           'equation': 0.07899539,
                                                                                           'estimate': 1.0816743,
                                                                                           'estimated': 0.2908085,
                                                                                           'estimates': 0.7743369,
                                                                                           'existing': 0.50358754,
                                                                                           'exposed': 0.91814655,
                                                                                           'field': 1.4176838,
                                                                                           'fields': 0.56111515,
                                                                                           'fixed': 0.653671,
                                                                                           'forest': 0.088545434,
                                                                                           'gage': 0.23066506,
                                                                                           'gb': 0.7216355,
                                                                                           'hardware': 0.5457616,
                                                                                           'honey': 0.13710178,
                                                                                           'host': 0.32896483,
                                                                                           'hu': 0.022061992,
                                                                                           'implement': 0.19801763,
                                                                                           'index': 1.5813339,
                                                                                           'indexed': 0.33440682,
                                                                                           'indicator': 0.07646061,
                                                                                           'indices': 1.0497515,
                                                                                           'ing': 0.44711637,
                                                                                           'integration': 0.38794386,
                                                                                           'inventory': 0.55072165,
                                                                                           'java': 1.0091366,
                                                                                           'kb': 0.31603098,
                                                                                           'ku': 1.2214607,
                                                                                           'largest': 0.55517995,
                                                                                           'length': 0.1961873,
                                                                                           'limit': 0.12602727,
                                                                                           'linear': 0.13019355,
                                                                                           'load': 0.7046929,
                                                                                           'map': 0.6723943,
                                                                                           'mapped': 0.6155787,
                                                                                           'mapping': 0.95820665,
                                                                                           'maps': 0.19839133,
                                                                                           'master': 1.3583598,
                                                                                           'math': 0.52316844,
                                                                                           'maximum': 0.17016214,
                                                                                           'mb': 0.8793483,
                                                                                           'measure': 0.37326512,
                                                                                           'memory': 1.3331418,
                                                                                           'metric': 0.9261499,
                                                                                           'minimum': 0.4176075,
                                                                                           'mining': 0.42999497,
                                                                                           'monitor': 0.34513482,
                                                                                           'monitoring': 0.6307714,
                                                                                           'multi': 0.3034215,
                                                                                           'network': 0.67814016,
                                                                                           'node': 1.2861586,
                                                                                           'nodes': 0.6710798,
                                                                                           'open': 1.3986069,
                                                                                           'optimal': 0.0624708,
                                                                                           'overhead': 0.69991654,
                                                                                           'parameters': 0.11732358,
                                                                                           'pattern': 0.005440311,
                                                                                           'per': 1.2889819,
                                                                                           'performance': 0.14103872,
                                                                                           'poll': 0.52450436,
                                                                                           'polling': 0.3777002,
                                                                                           'polls': 0.60389787,
                                                                                           'predict': 0.038165692,
                                                                                           'published': 0.06970011,
                                                                                           'radar': 0.004892402,
                                                                                           'ram': 0.1705884,
                                                                                           'rank': 0.1464829,
                                                                                           'ratio': 0.6063533,
                                                                                           'reconciliation': 0.4469912,
                                                                                           'ref': 0.5476266,
                                                                                           'requirement': 0.92776734,
                                                                                           'requirements': 1.1151919,
                                                                                           'resolution': 0.34558743,
                                                                                           'resource': 0.21023308,
                                                                                           'resources': 0.925664,
                                                                                           'scale': 1.1254972,
                                                                                           'scaled': 0.25958243,
                                                                                           'scaling': 1.3571583,
                                                                                           'scope': 0.007439173,
                                                                                           'script': 0.108936414,
                                                                                           'search': 0.4840181,
                                                                                           'serial': 0.38776705,
                                                                                           'server': 0.36229628,
                                                                                           'sha': 1.6222633,
                                                                                           'sid': 0.4845318,
                                                                                           'since': 0.0958648,
                                                                                           'size': 1.1212213,
                                                                                           'sizes': 0.8831621,
                                                                                           'software': 0.10655975,
                                                                                           'sort': 0.23242046,
                                                                                           'specification': 0.36318856,
                                                                                           'specifications': 0.36570984,
                                                                                           'storage': 0.16639474,
                                                                                           'swarm': 0.012647891,
                                                                                           'target': 0.097013876,
                                                                                           'tier': 1.3347368,
                                                                                           'total': 0.2700686,
                                                                                           'trial': 0.48382765,
                                                                                           'up': 0.009041203,
                                                                                           'value': 0.5148574,
                                                                                           'version': 0.00331044,
                                                                                           'vote': 0.19521642,
                                                                                           'voting': 0.32694972,
                                                                                           'web': 0.43445045,
                                                                                           'which': 0.22146864},
                                                                            'text': 'overhead '
                                                                                    'of '
                                                                                    'the '
                                                                                    'field '
                                                                                    'mappings '
                                                                                    'of '
                                                                                    'the '
                                                                                    'indices '
                                                                                    'and '
                                                                                    'the '
                                                                                    'amount '
                                                                                    'of '
                                                                                    'memory '
                                                                                    'needed '
                                                                                    'for '
                                                                                    'each '
                                                                                    'open '
                                                                                    'shard '
                                                                                    'allocated '
                                                                                    'on '
                                                                                    'a '
                                                                                    'node '
                                                                                    'in '
                                                                                    'the '
                                                                                    'cluster. '
                                                                                    'Currently, '
                                                                                    'the '
                                                                                    'per-shard '
                                                                                    'memory '
                                                                                    'requirement '
                                                                                    'uses '
                                                                                    'a '
                                                                                    'fixed '
                                                                                    'estimate '
                                                                                    'of '
                                                                                    '6MB. '
                                                                                    'We '
                                                                                    'plan '
                                                                                    'to '
                                                                                    'refine '
                                                                                    'this '
                                                                                    'value. '
                                                                                    'The '
                                                                                    'estimate '
                                                                                    'for '
                                                                                    'the '
                                                                                    'memory '
                                                                                    'requirements '
                                                                                    'for '
                                                                                    'the '
                                                                                    'mappings '
                                                                                    'of '
                                                                                    'each '
                                                                                    'index '
                                                                                    'is '
                                                                                    'calculated '
                                                                                    'by '
                                                                                    'one '
                                                                                    'of '
                                                                                    'the '
                                                                                    'data '
                                                                                    'nodes '
                                                                                    'that '
                                                                                    'hosts '
                                                                                    'a '
                                                                                    'shard '
                                                                                    'of '
                                                                                    'the '
                                                                                    'index. '
                                                                                    'The '
                                                                                    'calculated '
                                                                                    'estimates '
                                                                                    'are '
                                                                                    'sent '
                                                                                    'to '
                                                                                    'the '
                                                                                    'master '
                                                                                    'node. '
                                                                                    'Whenever '
                                                                                    'there '
                                                                                    'is '
                                                                                    'a '
                                                                                    'mapping '
                                                                                    'change '
                                                                                    'this '
                                                                                    'estimate '
                                                                                    'is '
                                                                                    'updated '
                                                                                    'and '
                                                                                    'published '
                                                                                    'to '
                                                                                    'the '
                                                                                    'master '
                                                                                    'node '
                                                                                    'again. '
                                                                                    'The '
                                                                                    'master '
                                                                                    'node '
                                                                                    'serves '
                                                                                    'the '
                                                                                    'node '
                                                                                    'and '
                                                                                    'total '
                                                                                    'memory '
                                                                                    'metrics '
                                                                                    'based '
                                                                                    'on '
                                                                                    'this '
                                                                                    'information '
                                                                                    'via '
                                                                                    'the '
                                                                                    'autoscaling '
                                                                                    'metrics '
                                                                                    'API '
                                                                                    'to '
                                                                                    'the '
                                                                                    'autoscaler. '
                                                                                    'Scaling '
                                                                                    'the '
                                                                                    'cluster '
                                                                                    'The '
                                                                                    'autoscaler '
                                                                                    'is '
                                                                                    'responsible '
                                                                                    'for '
                                                                                    'monitoring '
                                                                                    'the '
                                                                                    'Elasticsearch '
                                                                                    'cluster '
                                                                                    'via '
                                                                                    'the '
                                                                                    'exposed '
                                                                                    'metrics, '
                                                                                    'calculating '
                                                                                    'the '
                                                                                    'desirable '
                                                                                    'cluster '
                                                                                    'size '
                                                                                    'to '
                                                                                    'adapt '
                                                                                    'to '
                                                                                    'the '
                                                                                    'indexing '
                                                                                    'workload, '
                                                                                    'and '
                                                                                    'updating '
                                                                                    'the '
                                                                                    'deployment '
                                                                                    'accordingly. '
                                                                                    'This '
                                                                                    'is '
                                                                                    'done '
                                                                                    'by '
                                                                                    'calculating '
                                                                                    'the '
                                                                                    'total '
                                                                                    'required '
                                                                                    'CPU '
                                                                                    'and '
                                                                                    'memory '
                                                                                    'resources '
                                                                                    'based '
                                                                                    'on '
                                                                                    'the '
                                                                                    'ingestion '
                                                                                    'load '
                                                                                    'and '
                                                                                    'memory '
                                                                                    'metrics. '
                                                                                    'The '
                                                                                    'sum '
                                                                                    'of '
                                                                                    'all '
                                                                                    'the '
                                                                                    'ingestion '
                                                                                    'load '
                                                                                    'per '
                                                                                    'node '
                                                                                    'values '
                                                                                    'determines '
                                                                                    'the '
                                                                                    'total '
                                                                                    'number '
                                                                                    'of '
                                                                                    'CPU '
                                                                                    'cores '
                                                                                    'needed '
                                                                                    'for '
                                                                                    'the '
                                                                                    'indexing '
                                                                                    'tier. '
                                                                                    'The '
                                                                                    'calculated '
                                                                                    'CPU '
                                                                                    'requirement '
                                                                                    'and '
                                                                                    'the '
                                                                                    'provided '
                                                                                    'minimum '
                                                                                    'node '
                                                                                    'and '
                                                                                    'tier '
                                                                                    'memory '
                                                                                    'resources '
                                                                                    'are '
                                                                                    'mapped '
                                                                                    'to '
                                                                                    'a '
                                                                                    'predetermined '
                                                                                    'set '
                                                                                    'of '
                                                                                    'cluster '
                                                                                    'sizes. '
                                                                                    'Each '
                                                                                    'cluster '
                                                                                    'size '
                                                                                    'determines '
                                                                                    'the '
                                                                                    'number '
                                                                                    'of '
                                                                                    'nodes '
                                                                                    'and '
                                                                                    'the '
                                                                                    'CPU, '
                                                                                    'memory '
                                                                                    'and '
                                                                                    'disk '
                                                                                    'size '
                                                                                    'of '
                                                                                    'each '
                                                                                    'node. '
                                                                                    'All '
                                                                                    'nodes '
                                                                                    'within '
                                                                                    'a '
                                                                                    'certain '
                                                                                    'cluster '
                                                                                    'size '
                                                                                    'have '
                                                                                    'the '
                                                                                    'same '
                                                                                    'hardware '
                                                                                    'specification. '
                                                                                    'There '
                                                                                    'is '
                                                                                    'a '
                                                                                    'fixed '
                                                                                    'ratio '
                                                                                    'between '
                                                                                    'CPU, '
                                                                                    'memory '
                                                                                    'and '
                                                                                    'disk, '
                                                                                    'thus '
                                                                                    'always '
                                                                                    'scaling '
                                                                                    'all '
                                                                                    '3 '
                                                                                    'resources '
                                                                                    'linearly. '
                                                                                    'The '
                                                                                    'existing '
                                                                                    'cluster '
                                                                                    'sizes '
                                                                                    'for '
                                                                                    'the '
                                                                                    'indexing '
                                                                                    'tier '
                                                                                    'are '
                                                                                    'based '
                                                                                    'on '
                                                                                    'node '
                                                                                    'sizes '
                                                                                    'starting '
                                                                                    'from '
                                                                                    '4GB/2vCPU/100GB '
                                                                                    'disk '
                                                                                    'to '
                                                                                    '64GB/32vCPU/1600GB '
                                                                                    'disk. '
                                                                                    'Once '
                                                                                    'the '
                                                                                    'Elasticsearch '
                                                                                    'cluster '
                                                                                    'scales '
                                                                                    'up '
                                                                                    'to '
                                                                                    'the '
                                                                                    'largest '
                                                                                    'node '
                                                                                    'size '
                                                                                    '(64GB '
                                                                                    'memory), '
                                                                                    'any '
                                                                                    'further '
                                                                                    'scale-up '
                                                                                    'adds '
                                                                                    'new '
                                                                                    '64GB '
                                                                                    'nodes, '
                                                                                    'allowing '
                                                                                    'a '
                                                                                    'cluster '
                                                                                    'to '
                                                                                    'scale '
                                                                                    'up '
                                                                                    'to '
                                                                                    '32 '
                                                                                    'nodes '
                                                                                    'of '
                                                                                    '64GB. '
                                                                                    'Note '
                                                                                    'that '
                                                                                    'this '
                                                                                    'is '
                                                                                    'not '
                                                                                    'a '
                                                                                    'hard '
                                                                                    'upper '
                                                                                    'bound '
                                                                                    'on '
                                                                                    'the '
                                                                                    'number '
                                                                                    'of '
                                                                                    'Elasticsearch '
                                                                                    'nodes '
                                                                                    'in '
                                                                                    'the '
                                                                                    'cluster '
                                                                                    'and '
                                                                                    'can '
                                                                                    'be '
                                                                                    'increased '
                                                                                    'if '
                                                                                    'necessary. '
                                                                                    'Every '
                                                                                    '5 '
                                                                                    'seconds '
                                                                                    'the '
                                                                                    'autoscaler '
                                                                                    'polls '
                                                                                    'metrics '
                                                                                    'from '
                                                                                    'the '
                                                                                    'master '
                                                                                    'node, '
                                                                                    'calculates '
                                                                                    'the '
                                                                                    'desirable '
                                                                                    'cluster '
                                                                                    'size '
                                                                                    'and '
                                                                                    'if '
                                                                                    'it '
                                                                                    'is '
                                                                                    'different '
                                                                                    'from '
                                                                                    'the '
                                                                                    'current '
                                                                                    'cluster '
                                                                                    'size, '
                                                                                    'it '
                                                                                    'updates '
                                                                                    'the '
                                                                                    'Elasticsearch '
                                                                                    'Kubernetes '
                                                                                    'Deployment '
                                                                                    'accordingly. '
                                                                                    'Note '
                                                                                    'that '
                                                                                    'the '
                                                                                    'actual '
                                                                                    'reconciliation '
                                                                                    'of '
                                                                                    'the '
                                                                                    'deployment '
                                                                                    'towards '
                                                                                    'the '
                                                                                    'desired '
                                                                                    'cluster '
                                                                                    'size '
                                                                                    'and '
                                                                                    'adding '
                                                                                    'and '
                                                                                    'removing '
                                                                                    'the '
                                                                                    'Elasticsearch '
                                                                                    'nodes '
                                                                                    'to '
                                                                                    'achieve '
                                                                                    'this '
                                                                                    'is '
                                                                                    'done '
                                                                                    'by '
                                                                                    'Kubernetes. '
                                                                                    'In '
                                                                                    'order '
                                                                                    'to '
                                                                                    'avoid '
                                                                                    'very '
                                                                                    'short-lived '
                                                                                    'changes '
                                                                                    'to '
                                                                                    'the'},
                                                                           {'embeddings': {'##ber': 0.03804658,
                                                                                           '##es': 0.1512185,
                                                                                           '##gb': 0.6443679,
                                                                                           '##hi': 0.36000288,
                                                                                           '##ika': 0.07467539,
                                                                                           '##ing': 0.6129379,
                                                                                           '##ler': 1.1574837,
                                                                                           '##less': 0.5735957,
                                                                                           '##ling': 1.1661593,
                                                                                           '##load': 0.62337583,
                                                                                           '##net': 0.58226395,
                                                                                           '##oya': 1.7074469,
                                                                                           '##pu': 1.1345644,
                                                                                           '##rch': 1.0119687,
                                                                                           '##sca': 1.5153302,
                                                                                           '##sea': 1.4253823,
                                                                                           '##vc': 1.4631956,
                                                                                           '100': 0.55265766,
                                                                                           '15': 0.052379817,
                                                                                           '16': 0.33394203,
                                                                                           '160': 0.118766,
                                                                                           '1600': 0.8028694,
                                                                                           '32': 1.1772103,
                                                                                           '4': 0.16181825,
                                                                                           '64': 1.4588842,
                                                                                           'algorithm': 0.94727564,
                                                                                           'always': 0.38941032,
                                                                                           'amazon': 0.89331883,
                                                                                           'analysis': 0.4050502,
                                                                                           'analyze': 0.023668261,
                                                                                           'andersen': 0.49676144,
                                                                                           'apache': 0.80054885,
                                                                                           'ariel': 0.4422102,
                                                                                           'auto': 1.2729144,
                                                                                           'automatic': 0.7698037,
                                                                                           'automatically': 0.04643825,
                                                                                           'availability': 0.49544457,
                                                                                           'available': 0.19981025,
                                                                                           'blog': 0.50581634,
                                                                                           'boat': 0.4211383,
                                                                                           'bot': 0.44343898,
                                                                                           'bug': 0.16439897,
                                                                                           'calculate': 0.44946215,
                                                                                           'calculating': 0.21078831,
                                                                                           'calculation': 0.91136605,
                                                                                           'calculations': 0.35172287,
                                                                                           'capacity': 0.32551798,
                                                                                           'certification': 0.96537966,
                                                                                           'certified': 0.86568826,
                                                                                           'change': 0.091490604,
                                                                                           'checkpoint': 0.13703609,
                                                                                           'chess': 0.30361477,
                                                                                           'class': 0.12189255,
                                                                                           'cloud': 0.36273655,
                                                                                           'cluster': 2.1554685,
                                                                                           'clusters': 0.84253734,
                                                                                           'competition': 0.0070358375,
                                                                                           'component': 0.16093102,
                                                                                           'components': 0.688979,
                                                                                           'computation': 0.0109849,
                                                                                           'computer': 0.37449652,
                                                                                           'computers': 0.29611063,
                                                                                           'constant': 0.21192689,
                                                                                           'cpu': 0.9483953,
                                                                                           'crawl': 0.061979044,
                                                                                           'data': 0.29847682,
                                                                                           'database': 0.53361094,
                                                                                           'define': 0.30592072,
                                                                                           'deployment': 1.1050912,
                                                                                           'desirable': 0.28776327,
                                                                                           'determination': 0.25265238,
                                                                                           'determine': 0.4538456,
                                                                                           'determined': 0.5666302,
                                                                                           'determines': 0.02666208,
                                                                                           'dimensions': 0.43506965,
                                                                                           'disadvantage': 0.40544793,
                                                                                           'disk': 1.0043706,
                                                                                           'domain': 0.08386699,
                                                                                           'down': 1.1079221,
                                                                                           'each': 0.20502539,
                                                                                           'elastic': 2.0313072,
                                                                                           'engineer': 0.41261968,
                                                                                           'engineering': 0.43656224,
                                                                                           'existing': 0.82118076,
                                                                                           'expensive': 0.10213457,
                                                                                           'factors': 0.04067958,
                                                                                           'fernandez': 1.1611929,
                                                                                           'fixed': 0.6458474,
                                                                                           'forest': 0.07132318,
                                                                                           'francisco': 1.0563725,
                                                                                           'garcia': 0.13344267,
                                                                                           'gb': 0.6862939,
                                                                                           'global': 0.0054082987,
                                                                                           'hardware': 0.7944886,
                                                                                           'hen': 0.9853478,
                                                                                           'honey': 0.081156164,
                                                                                           'hour': 0.0074544367,
                                                                                           'hours': 0.24539681,
                                                                                           'hu': 0.06941744,
                                                                                           'implement': 0.23772681,
                                                                                           'implementation': 0.07986039,
                                                                                           'improve': 0.2981144,
                                                                                           'increase': 0.7570058,
                                                                                           'increasing': 0.25063965,
                                                                                           'index': 1.358504,
                                                                                           'indexed': 0.29916498,
                                                                                           'ing': 0.49232894,
                                                                                           'integration': 0.20372295,
                                                                                           'inventory': 0.49392712,
                                                                                           'java': 0.96544707,
                                                                                           'jose': 0.014233379,
                                                                                           'ku': 1.0064884,
                                                                                           'large': 0.009199611,
                                                                                           'largest': 0.5853634,
                                                                                           'latest': 0.075750045,
                                                                                           'learning': 0.14278692,
                                                                                           'length': 0.2575359,
                                                                                           'limit': 0.27284575,
                                                                                           'linear': 0.99686086,
                                                                                           'load': 0.78078943,
                                                                                           'loading': 0.09809506,
                                                                                           'log': 0.053032227,
                                                                                           'lopez': 0.37077188,
                                                                                           'machine': 0.1154489,
                                                                                           'maintenance': 0.24795005,
                                                                                           'management': 0.28454626,
                                                                                           'map': 0.12368915,
                                                                                           'master': 1.0599743,
                                                                                           'math': 0.39245087,
                                                                                           'maximum': 0.37043598,
                                                                                           'mb': 0.65867126,
                                                                                           'measure': 0.401138,
                                                                                           'mechanism': 0.5363481,
                                                                                           'memory': 1.0781962,
                                                                                           'metric': 0.9361899,
                                                                                           'mining': 0.4610803,
                                                                                           'minute': 0.7122368,
                                                                                           'minutes': 0.03330799,
                                                                                           'multiple': 0.28440112,
                                                                                           'network': 0.70334154,
                                                                                           'new': 0.36585885,
                                                                                           'node': 1.1508181,
                                                                                           'nodes': 0.6786249,
                                                                                           'number': 0.46848533,
                                                                                           'online': 0.10060778,
                                                                                           'operation': 0.013929884,
                                                                                           'optimal': 0.052087568,
                                                                                           'overhead': 0.12910955,
                                                                                           'performance': 0.10508823,
                                                                                           'po': 0.030801829,
                                                                                           'poll': 0.032789562,
                                                                                           'polling': 0.08606442,
                                                                                           'polls': 0.31255096,
                                                                                           'predict': 0.038815167,
                                                                                           'process': 0.32648584,
                                                                                           'processing': 0.13010792,
                                                                                           'quan': 0.30870175,
                                                                                           'rank': 0.23912333,
                                                                                           'ratio': 1.1149174,
                                                                                           'ratios': 0.17480499,
                                                                                           'ready': 0.7220055,
                                                                                           'reconciliation': 0.03476886,
                                                                                           'reduce': 0.48650545,
                                                                                           'regulation': 0.14490134,
                                                                                           'requirements': 0.26383802,
                                                                                           'resource': 0.48044914,
                                                                                           'resources': 0.99925154,
                                                                                           'sale': 0.23320372,
                                                                                           'same': 0.04602473,
                                                                                           'scala': 0.34763098,
                                                                                           'scale': 1.3520039,
                                                                                           'scaled': 0.373489,
                                                                                           'scales': 0.23150739,
                                                                                           'scaling': 1.3547646,
                                                                                           'scope': 0.24351352,
                                                                                           'sea': 0.012636473,
                                                                                           'search': 0.5437506,
                                                                                           'seconds': 0.21717648,
                                                                                           'serial': 0.084758565,
                                                                                           'server': 0.66100806,
                                                                                           'si': 0.13631321,
                                                                                           'sid': 0.4065147,
                                                                                           'size': 1.4813008,
                                                                                           'sizes': 1.1315687,
                                                                                           'software': 0.053653706,
                                                                                           'sort': 0.34857363,
                                                                                           'specification': 0.47748893,
                                                                                           'specifications': 0.54209507,
                                                                                           'square': 0.0464906,
                                                                                           'storage': 0.2826658,
                                                                                           'strategy': 0.105019435,
                                                                                           'swarm': 0.08799058,
                                                                                           'three': 0.0456386,
                                                                                           'tier': 1.2590698,
                                                                                           'torre': 0.033106416,
                                                                                           'total': 0.15115097,
                                                                                           'trainer': 0.28730983,
                                                                                           'training': 0.91525143,
                                                                                           'trial': 0.40092948,
                                                                                           'unit': 0.12670164,
                                                                                           'up': 0.48489103,
                                                                                           'user': 0.5006898,
                                                                                           'users': 0.35868,
                                                                                           'vote': 0.16288216,
                                                                                           'voting': 0.2478986,
                                                                                           'web': 0.44947043},
                                                                            'text': 'of '
                                                                                    'cluster '
                                                                                    'sizes. '
                                                                                    'Each '
                                                                                    'cluster '
                                                                                    'size '
                                                                                    'determines '
                                                                                    'the '
                                                                                    'number '
                                                                                    'of '
                                                                                    'nodes '
                                                                                    'and '
                                                                                    'the '
                                                                                    'CPU, '
                                                                                    'memory '
                                                                                    'and '
                                                                                    'disk '
                                                                                    'size '
                                                                                    'of '
                                                                                    'each '
                                                                                    'node. '
                                                                                    'All '
                                                                                    'nodes '
                                                                                    'within '
                                                                                    'a '
                                                                                    'certain '
                                                                                    'cluster '
                                                                                    'size '
                                                                                    'have '
                                                                                    'the '
                                                                                    'same '
                                                                                    'hardware '
                                                                                    'specification. '
                                                                                    'There '
                                                                                    'is '
                                                                                    'a '
                                                                                    'fixed '
                                                                                    'ratio '
                                                                                    'between '
                                                                                    'CPU, '
                                                                                    'memory '
                                                                                    'and '
                                                                                    'disk, '
                                                                                    'thus '
                                                                                    'always '
                                                                                    'scaling '
                                                                                    'all '
                                                                                    '3 '
                                                                                    'resources '
                                                                                    'linearly. '
                                                                                    'The '
                                                                                    'existing '
                                                                                    'cluster '
                                                                                    'sizes '
                                                                                    'for '
                                                                                    'the '
                                                                                    'indexing '
                                                                                    'tier '
                                                                                    'are '
                                                                                    'based '
                                                                                    'on '
                                                                                    'node '
                                                                                    'sizes '
                                                                                    'starting '
                                                                                    'from '
                                                                                    '4GB/2vCPU/100GB '
                                                                                    'disk '
                                                                                    'to '
                                                                                    '64GB/32vCPU/1600GB '
                                                                                    'disk. '
                                                                                    'Once '
                                                                                    'the '
                                                                                    'Elasticsearch '
                                                                                    'cluster '
                                                                                    'scales '
                                                                                    'up '
                                                                                    'to '
                                                                                    'the '
                                                                                    'largest '
                                                                                    'node '
                                                                                    'size '
                                                                                    '(64GB '
                                                                                    'memory), '
                                                                                    'any '
                                                                                    'further '
                                                                                    'scale-up '
                                                                                    'adds '
                                                                                    'new '
                                                                                    '64GB '
                                                                                    'nodes, '
                                                                                    'allowing '
                                                                                    'a '
                                                                                    'cluster '
                                                                                    'to '
                                                                                    'scale '
                                                                                    'up '
                                                                                    'to '
                                                                                    '32 '
                                                                                    'nodes '
                                                                                    'of '
                                                                                    '64GB. '
                                                                                    'Note '
                                                                                    'that '
                                                                                    'this '
                                                                                    'is '
                                                                                    'not '
                                                                                    'a '
                                                                                    'hard '
                                                                                    'upper '
                                                                                    'bound '
                                                                                    'on '
                                                                                    'the '
                                                                                    'number '
                                                                                    'of '
                                                                                    'Elasticsearch '
                                                                                    'nodes '
                                                                                    'in '
                                                                                    'the '
                                                                                    'cluster '
                                                                                    'and '
                                                                                    'can '
                                                                                    'be '
                                                                                    'increased '
                                                                                    'if '
                                                                                    'necessary. '
                                                                                    'Every '
                                                                                    '5 '
                                                                                    'seconds '
                                                                                    'the '
                                                                                    'autoscaler '
                                                                                    'polls '
                                                                                    'metrics '
                                                                                    'from '
                                                                                    'the '
                                                                                    'master '
                                                                                    'node, '
                                                                                    'calculates '
                                                                                    'the '
                                                                                    'desirable '
                                                                                    'cluster '
                                                                                    'size '
                                                                                    'and '
                                                                                    'if '
                                                                                    'it '
                                                                                    'is '
                                                                                    'different '
                                                                                    'from '
                                                                                    'the '
                                                                                    'current '
                                                                                    'cluster '
                                                                                    'size, '
                                                                                    'it '
                                                                                    'updates '
                                                                                    'the '
                                                                                    'Elasticsearch '
                                                                                    'Kubernetes '
                                                                                    'Deployment '
                                                                                    'accordingly. '
                                                                                    'Note '
                                                                                    'that '
                                                                                    'the '
                                                                                    'actual '
                                                                                    'reconciliation '
                                                                                    'of '
                                                                                    'the '
                                                                                    'deployment '
                                                                                    'towards '
                                                                                    'the '
                                                                                    'desired '
                                                                                    'cluster '
                                                                                    'size '
                                                                                    'and '
                                                                                    'adding '
                                                                                    'and '
                                                                                    'removing '
                                                                                    'the '
                                                                                    'Elasticsearch '
                                                                                    'nodes '
                                                                                    'to '
                                                                                    'achieve '
                                                                                    'this '
                                                                                    'is '
                                                                                    'done '
                                                                                    'by '
                                                                                    'Kubernetes. '
                                                                                    'In '
                                                                                    'order '
                                                                                    'to '
                                                                                    'avoid '
                                                                                    'very '
                                                                                    'short-lived '
                                                                                    'changes '
                                                                                    'to '
                                                                                    'the '
                                                                                    'cluster '
                                                                                    'size, '
                                                                                    'we '
                                                                                    'account '
                                                                                    'for '
                                                                                    'a '
                                                                                    '10% '
                                                                                    'headroom '
                                                                                    'when '
                                                                                    'calculating '
                                                                                    'the '
                                                                                    'desired '
                                                                                    'cluster '
                                                                                    'size '
                                                                                    'during '
                                                                                    'a '
                                                                                    'scale '
                                                                                    'down '
                                                                                    'and '
                                                                                    'a '
                                                                                    'scale '
                                                                                    'down '
                                                                                    'takes '
                                                                                    'effect '
                                                                                    'only '
                                                                                    'if '
                                                                                    'all '
                                                                                    'desired '
                                                                                    'cluster '
                                                                                    'size '
                                                                                    'calculations '
                                                                                    'within '
                                                                                    'the '
                                                                                    'past '
                                                                                    '15 '
                                                                                    'minutes '
                                                                                    'have '
                                                                                    'indicated '
                                                                                    'a '
                                                                                    'scale-down. '
                                                                                    'Currently, '
                                                                                    'the '
                                                                                    'time '
                                                                                    'that '
                                                                                    'it '
                                                                                    'takes '
                                                                                    'for '
                                                                                    'an '
                                                                                    'increase '
                                                                                    'in '
                                                                                    'the '
                                                                                    'metrics '
                                                                                    'to '
                                                                                    'lead '
                                                                                    'to '
                                                                                    'the '
                                                                                    'first '
                                                                                    'Elasticsearch '
                                                                                    'node '
                                                                                    'being '
                                                                                    'added '
                                                                                    'to '
                                                                                    'the '
                                                                                    'cluster '
                                                                                    'and '
                                                                                    'ready '
                                                                                    'to '
                                                                                    'process '
                                                                                    'indexing '
                                                                                    'load '
                                                                                    'is '
                                                                                    'under '
                                                                                    '1 '
                                                                                    'minute. '
                                                                                    'Conclusion '
                                                                                    'In '
                                                                                    'this '
                                                                                    'blog '
                                                                                    'post, '
                                                                                    'we '
                                                                                    'explained '
                                                                                    'how '
                                                                                    'ingest '
                                                                                    'autoscaling '
                                                                                    'works '
                                                                                    'in '
                                                                                    'Elasticsearch, '
                                                                                    'the '
                                                                                    'different '
                                                                                    'components '
                                                                                    'involved, '
                                                                                    'and '
                                                                                    'the '
                                                                                    'metrics '
                                                                                    'used '
                                                                                    'to '
                                                                                    'quantify '
                                                                                    'the '
                                                                                    'resources '
                                                                                    'needed '
                                                                                    'to '
                                                                                    'handle '
                                                                                    'the '
                                                                                    'indexing '
                                                                                    'workload. '
                                                                                    'We '
                                                                                    'believe '
                                                                                    'that '
                                                                                    'such '
                                                                                    'an '
                                                                                    'autoscaling '
                                                                                    'mechanism '
                                                                                    'is '
                                                                                    'crucial '
                                                                                    'to '
                                                                                    'reduce '
                                                                                    'the '
                                                                                    'operational '
                                                                                    'overhead '
                                                                                    'of '
                                                                                    'an '
                                                                                    'Elasticsearch '
                                                                                    'cluster '
                                                                                    'for '
                                                                                    'the '
                                                                                    'users '
                                                                                    'by '
                                                                                    'automatically '
                                                                                    'increasing '
                                                                                    'the '
                                                                                    'available '
                                                                                    'resources '
                                                                                    'in '
                                                                                    'the '
                                                                                    'cluster '
                                                                                    'when '
                                                                                    'necessary. '
                                                                                    'Furthermore, '
                                                                                    'it '
                                                                                    'leads '
                                                                                    'to '
                                                                                    'cost '
                                                                                    'reduction '
                                                                                    'by '
                                                                                    'scaling '
                                                                                    'down '
                                                                                    'the '
                                                                                    'cluster '
                                                                                    'when '
                                                                                    'the '
                                                                                    'available '
                                                                                    'resources '
                                                                                    'in '
                                                                                    'the '
                                                                                    'cluster '
                                                                                    'are '
                                                                                    'not '
                                                                                    'required '
                                                                                    'anymore. '
                                                                                    'Ready '
                                                                                    'to '
                                                                                    'try '
                                                                                    'this '
                                                                                    'out '
                                                                                    'on '
                                                                                    'your '
                                                                                    'own? '
                                                                                    'Start '
                                                                                    'a '
                                                                                    'free '
                                                                                    'trial '
                                                                                    '. '
                                                                                    'Want '
                                                                                    'to '
                                                                                    'get '
                                                                                    'Elastic '
                                                                                    'certified? '
                                                                                    'Find '
                                                                                    'out '
                                                                                    'when '
                                                                                    'the '
                                                                                    'next '
                                                                                    'Elasticsearch '
                                                                                    'Engineer '
                                                                                    'training '
                                                                                    'is '
                                                                                    'running! '
                                                                                    'Pooya '
                                                                                    'Salehi '
                                                                                    'Henning '
                                                                                    'Andersen '
                                                                                    'Francisco '
                                                                                    'Fernández '
                                                                                    'Castaño '
                                                                                    '11 '
                                                                                    'min '
                                                                                    'read '
                                                                                    '29 '
                                                                                    'July '
                                                                                    '2024 '
                                                                                    'Elastic '
                                                                                    'Cloud '
                                                                                    'Serverless '
                                                                                    'Share '
                                                                                    'Twitter '
                                                                                    'Facebook '
                                                                                    'LinkedIn '
                                                                                    'Recommended '
                                                                                    'Articles '
                                                                                    'Elastic '
                                                                                    'Cloud'},
                                                                           {'embeddings': {'##4': 0.5609497,
                                                                                           '##down': 0.011559885,
                                                                                           '##est': 1.1421111,
                                                                                           '##hi': 0.0060656513,
                                                                                           '##ing': 0.48465544,
                                                                                           '##ler': 0.12595108,
                                                                                           '##less': 1.3963115,
                                                                                           '##lessly': 0.76121324,
                                                                                           '##ling': 1.03232,
                                                                                           '##load': 0.6918682,
                                                                                           '##oya': 0.56508857,
                                                                                           '##rch': 0.94580704,
                                                                                           '##room': 1.397477,
                                                                                           '##sca': 1.4164101,
                                                                                           '##sea': 1.4075159,
                                                                                           '10': 0.005647892,
                                                                                           '15': 1.0004816,
                                                                                           '16': 0.0726173,
                                                                                           '202': 0.79451597,
                                                                                           'account': 0.054787852,
                                                                                           'accounting': 0.2977837,
                                                                                           'advantage': 0.13797385,
                                                                                           'after': 0.04746113,
                                                                                           'algorithm': 0.84724355,
                                                                                           'amazon': 0.7599511,
                                                                                           'analysis': 0.4048887,
                                                                                           'analyze': 0.12881227,
                                                                                           'andersen': 0.091110215,
                                                                                           'anya': 0.031511437,
                                                                                           'apache': 0.8387389,
                                                                                           'architect': 0.57877886,
                                                                                           'archive': 0.027499544,
                                                                                           'august': 0.523268,
                                                                                           'auto': 1.4506402,
                                                                                           'automatic': 0.94025064,
                                                                                           'availability': 0.348747,
                                                                                           'available': 0.05306761,
                                                                                           'blog': 0.8397168,
                                                                                           'bot': 0.38508278,
                                                                                           'bug': 0.1267487,
                                                                                           'build': 0.776895,
                                                                                           'building': 0.7504958,
                                                                                           'built': 0.19563165,
                                                                                           'calculate': 0.3598465,
                                                                                           'calculating': 0.11605539,
                                                                                           'calculation': 0.8540975,
                                                                                           'calculations': 0.57275534,
                                                                                           'capacity': 0.3109483,
                                                                                           'cave': 0.29021654,
                                                                                           'certification': 0.64684826,
                                                                                           'certified': 0.26541537,
                                                                                           'checkpoint': 0.06267695,
                                                                                           'chess': 0.22270066,
                                                                                           'class': 0.044449553,
                                                                                           'client': 0.05088419,
                                                                                           'cloud': 0.9856347,
                                                                                           'cluster': 1.8377897,
                                                                                           'clustered': 0.18159664,
                                                                                           'clusters': 0.79538465,
                                                                                           'collapse': 0.29267746,
                                                                                           'component': 0.012821147,
                                                                                           'components': 0.50653857,
                                                                                           'computer': 0.22416146,
                                                                                           'cost': 0.06086615,
                                                                                           'crawl': 0.27863678,
                                                                                           'data': 0.23600358,
                                                                                           'database': 0.386357,
                                                                                           'decrease': 0.29198787,
                                                                                           'deployment': 0.4085412,
                                                                                           'desired': 0.04168813,
                                                                                           'development': 0.0050133946,
                                                                                           'dimensions': 0.10934332,
                                                                                           'disadvantage': 0.33458805,
                                                                                           'domain': 0.16470446,
                                                                                           'down': 1.343148,
                                                                                           'downs': 0.2709486,
                                                                                           'drop': 0.19782026,
                                                                                           'during': 0.4177895,
                                                                                           'effect': 0.39730436,
                                                                                           'elastic': 1.9854976,
                                                                                           'engineer': 0.58167315,
                                                                                           'engineering': 0.5884908,
                                                                                           'ensemble': 0.007619722,
                                                                                           'facebook': 0.3225428,
                                                                                           'fernandez': 0.42895493,
                                                                                           'fifteen': 0.10546452,
                                                                                           'first': 0.50220585,
                                                                                           'forest': 0.14911638,
                                                                                           'framework': 0.047809396,
                                                                                           'free': 0.3561092,
                                                                                           'global': 0.09408311,
                                                                                           'group': 0.14574468,
                                                                                           'handling': 0.30345336,
                                                                                           'head': 0.117694445,
                                                                                           'hour': 0.3250166,
                                                                                           'hours': 0.70438623,
                                                                                           'implement': 0.13235687,
                                                                                           'implementation': 0.13236406,
                                                                                           'important': 0.055658367,
                                                                                           'improve': 0.2550515,
                                                                                           'increase': 0.74923754,
                                                                                           'increasing': 0.3597461,
                                                                                           'index': 1.4273754,
                                                                                           'indexed': 0.2932871,
                                                                                           'ing': 1.2874681,
                                                                                           'introduced': 0.10785041,
                                                                                           'inventory': 0.65916276,
                                                                                           'java': 0.88944626,
                                                                                           'july': 0.14186577,
                                                                                           'large': 0.06278902,
                                                                                           'latest': 0.068817586,
                                                                                           'learning': 0.12424224,
                                                                                           'length': 0.030345708,
                                                                                           'limit': 0.14073928,
                                                                                           'load': 1.0610044,
                                                                                           'loading': 0.39865428,
                                                                                           'loss': 0.11432742,
                                                                                           'machine': 0.029201662,
                                                                                           'maintenance': 0.15768714,
                                                                                           'management': 0.31734702,
                                                                                           'math': 0.406777,
                                                                                           'maximum': 0.13483465,
                                                                                           'measure': 0.5081328,
                                                                                           'mechanism': 0.8204686,
                                                                                           'memory': 1.0461255,
                                                                                           'metric': 0.9943368,
                                                                                           'mining': 0.5402124,
                                                                                           'minute': 0.92393905,
                                                                                           'minutes': 0.3759728,
                                                                                           'moment': 0.11160666,
                                                                                           'morris': 0.060925715,
                                                                                           'network': 0.51853234,
                                                                                           'node': 0.99145895,
                                                                                           'online': 0.36771652,
                                                                                           'operation': 0.28533393,
                                                                                           'overhead': 0.086819395,
                                                                                           'patience': 0.11310515,
                                                                                           'perfect': 0.12382903,
                                                                                           'performance': 0.06312573,
                                                                                           'process': 0.5356137,
                                                                                           'processing': 0.55718875,
                                                                                           'production': 0.05736718,
                                                                                           'project': 0.14496073,
                                                                                           'prototype': 0.31378728,
                                                                                           'quan': 0.22408743,
                                                                                           'ready': 0.25202373,
                                                                                           'reduce': 0.5264253,
                                                                                           'reduction': 0.037918843,
                                                                                           'research': 0.0142833255,
                                                                                           'resource': 0.09839988,
                                                                                           'resources': 0.7532266,
                                                                                           'rights': 0.08338795,
                                                                                           'room': 0.84089494,
                                                                                           'rs': 0.47752637,
                                                                                           'scala': 0.17796026,
                                                                                           'scale': 1.6349432,
                                                                                           'scaled': 0.39957505,
                                                                                           'scales': 0.24761787,
                                                                                           'scaling': 1.3751862,
                                                                                           'scope': 0.009172562,
                                                                                           'search': 0.6669978,
                                                                                           'seconds': 0.11594447,
                                                                                           'serial': 0.21314114,
                                                                                           'server': 1.1875997,
                                                                                           'servers': 0.3761195,
                                                                                           'share': 0.21588095,
                                                                                           'shrink': 0.08177304,
                                                                                           'si': 0.039096646,
                                                                                           'sid': 0.26323187,
                                                                                           'site': 0.27832702,
                                                                                           'size': 1.2518198,
                                                                                           'sizes': 0.68347317,
                                                                                           'small': 0.021309003,
                                                                                           'software': 0.21712899,
                                                                                           'sort': 0.46309024,
                                                                                           'step': 0.13614927,
                                                                                           'storage': 0.33423752,
                                                                                           'strategy': 0.2746019,
                                                                                           'swarm': 0.18959516,
                                                                                           'task': 0.12210263,
                                                                                           'time': 0.3716685,
                                                                                           'traffic': 0.0044686934,
                                                                                           'training': 0.56078845,
                                                                                           'trial': 0.30781624,
                                                                                           'tutor': 0.18126883,
                                                                                           'twitter': 0.7352328,
                                                                                           'useful': 0.07486964,
                                                                                           'user': 0.61840165,
                                                                                           'users': 0.5178945,
                                                                                           'wait': 0.12994274,
                                                                                           'weaving': 0.09568315,
                                                                                           'web': 0.3402482,
                                                                                           'website': 0.17116618,
                                                                                           'work': 0.38590312,
                                                                                           'working': 0.040917397,
                                                                                           'works': 0.2640411,
                                                                                           'years': 0.057129644},
                                                                            'text': 'cluster '
                                                                                    'size, '
                                                                                    'we '
                                                                                    'account '
                                                                                    'for '
                                                                                    'a '
                                                                                    '10% '
                                                                                    'headroom '
                                                                                    'when '
                                                                                    'calculating '
                                                                                    'the '
                                                                                    'desired '
                                                                                    'cluster '
                                                                                    'size '
                                                                                    'during '
                                                                                    'a '
                                                                                    'scale '
                                                                                    'down '
                                                                                    'and '
                                                                                    'a '
                                                                                    'scale '
                                                                                    'down '
                                                                                    'takes '
                                                                                    'effect '
                                                                                    'only '
                                                                                    'if '
                                                                                    'all '
                                                                                    'desired '
                                                                                    'cluster '
                                                                                    'size '
                                                                                    'calculations '
                                                                                    'within '
                                                                                    'the '
                                                                                    'past '
                                                                                    '15 '
                                                                                    'minute '
                                                                                    'have '
                                                                                    'indicated '
                                                                                    'a '
                                                                                    'scale-down. '
                                                                                    'Currently, '
                                                                                    'the '
                                                                                    'time '
                                                                                    'that '
                                                                                    'it '
                                                                                    'takes '
                                                                                    'for '
                                                                                    'an '
                                                                                    'increase '
                                                                                    'in '
                                                                                    'the '
                                                                                    'metrics '
                                                                                    'to '
                                                                                    'lead '
                                                                                    'to '
                                                                                    'the '
                                                                                    'first '
                                                                                    'Elasticsearch '
                                                                                    'node '
                                                                                    'being '
                                                                                    'added '
                                                                                    'to '
                                                                                    'the '
                                                                                    'cluster '
                                                                                    'and '
                                                                                    'ready '
                                                                                    'to '
                                                                                    'process '
                                                                                    'indexing '
                                                                                    'load '
                                                                                    'is '
                                                                                    'under '
                                                                                    '1 '
                                                                                    'minute. '
                                                                                    'Conclusion '
                                                                                    'In '
                                                                                    'this '
                                                                                    'blog '
                                                                                    'post, '
                                                                                    'we '
                                                                                    'explained '
                                                                                    'how '
                                                                                    'ingest '
                                                                                    'autoscaling '
                                                                                    'works '
                                                                                    'in '
                                                                                    'Elasticsearch, '
                                                                                    'the '
                                                                                    'different '
                                                                                    'components '
                                                                                    'involved, '
                                                                                    'and '
                                                                                    'the '
                                                                                    'metrics '
                                                                                    'used '
                                                                                    'to '
                                                                                    'quantify '
                                                                                    'the '
                                                                                    'resources '
                                                                                    'needed '
                                                                                    'to '
                                                                                    'handle '
                                                                                    'the '
                                                                                    'indexing '
                                                                                    'workload. '
                                                                                    'We '
                                                                                    'believe '
                                                                                    'that '
                                                                                    'such '
                                                                                    'an '
                                                                                    'autoscaling '
                                                                                    'mechanism '
                                                                                    'is '
                                                                                    'crucial '
                                                                                    'to '
                                                                                    'reduce '
                                                                                    'the '
                                                                                    'operational '
                                                                                    'overhead '
                                                                                    'of '
                                                                                    'an '
                                                                                    'Elasticsearch '
                                                                                    'cluster '
                                                                                    'for '
                                                                                    'the '
                                                                                    'users '
                                                                                    'by '
                                                                                    'automatically '
                                                                                    'increasing '
                                                                                    'the '
                                                                                    'available '
                                                                                    'resources '
                                                                                    'in '
                                                                                    'the '
                                                                                    'cluster '
                                                                                    'when '
                                                                                    'necessary. '
                                                                                    'Furthermore, '
                                                                                    'it '
                                                                                    'leads '
                                                                                    'to '
                                                                                    'cost '
                                                                                    'reduction '
                                                                                    'by '
                                                                                    'scaling '
                                                                                    'down '
                                                                                    'the '
                                                                                    'cluster '
                                                                                    'when '
                                                                                    'the '
                                                                                    'available '
                                                                                    'resources '
                                                                                    'in '
                                                                                    'the '
                                                                                    'cluster '
                                                                                    'are '
                                                                                    'not '
                                                                                    'required '
                                                                                    'anymore. '
                                                                                    'Ready '
                                                                                    'to '
                                                                                    'try '
                                                                                    'this '
                                                                                    'out '
                                                                                    'on '
                                                                                    'your '
                                                                                    'own? '
                                                                                    'Start '
                                                                                    'a '
                                                                                    'free '
                                                                                    'trial '
                                                                                    '. '
                                                                                    'Want '
                                                                                    'to '
                                                                                    'get '
                                                                                    'Elastic '
                                                                                    'certified? '
                                                                                    'Find '
                                                                                    'out '
                                                                                    'when '
                                                                                    'the '
                                                                                    'next '
                                                                                    'Elasticsearch '
                                                                                    'Engineer '
                                                                                    'training '
                                                                                    'is '
                                                                                    'running! '
                                                                                    'Pooya '
                                                                                    'Salehi '
                                                                                    'Henning '
                                                                                    'Andersen '
                                                                                    'Francisco '
                                                                                    'Fernández '
                                                                                    'Castaño '
                                                                                    '11 '
                                                                                    'min '
                                                                                    'read '
                                                                                    '29 '
                                                                                    'July '
                                                                                    '2024 '
                                                                                    'Elastic '
                                                                                    'Cloud '
                                                                                    'Serverless '
                                                                                    'Share '
                                                                                    'Twitter '
                                                                                    'Facebook '
                                                                                    'LinkedIn '
                                                                                    'Recommended '
                                                                                    'Articles '
                                                                                    'Elastic '
                                                                                    'Cloud '
                                                                                    'Serverless '
                                                                                    '• '
                                                                                    '15 '
                                                                                    'May '
                                                                                    '2024 '
                                                                                    'Building '
                                                                                    'Elastic '
                                                                                    'Cloud '
                                                                                    'Serverless '
                                                                                    'Explore '
                                                                                    'the '
                                                                                    'architectural '
                                                                                    'decisions '
                                                                                    'we '
                                                                                    'made '
                                                                                    'along '
                                                                                    'the '
                                                                                    'journey '
                                                                                    'of '
                                                                                    'building '
                                                                                    'Elastic '
                                                                                    'Cloud '
                                                                                    'Serverless. '
                                                                                    'Jason '
                                                                                    'Tedor '
                                                                                    'Pooya '
                                                                                    'Salehi '
                                                                                    'Henning '
                                                                                    'Andersen '
                                                                                    'Francisco '
                                                                                    'Fernández '
                                                                                    'Castaño '
                                                                                    '11 '
                                                                                    'min '
                                                                                    'read '
                                                                                    '29 '
                                                                                    'July '
                                                                                    '2024 '
                                                                                    'Elastic '
                                                                                    'Cloud '
                                                                                    'Serverless '
                                                                                    'Share '
                                                                                    'Twitter '
                                                                                    'Facebook '
                                                                                    'LinkedIn '
                                                                                    'Jump '
                                                                                    'to '
                                                                                    'Ingest '
                                                                                    'autoscaling '
                                                                                    'overview '
                                                                                    'Metrics '
                                                                                    'Ingestion '
                                                                                    'load '
                                                                                    'Memory '
                                                                                    'Scaling '
                                                                                    'the '
                                                                                    'cluster '
                                                                                    'Show '
                                                                                    'more '
                                                                                    'Sitemap '
                                                                                    'RSS '
                                                                                    'Feed '
                                                                                    'Search '
                                                                                    'Labs '
                                                                                    'Repo '
                                                                                    'Elastic.co '
                                                                                    '©2024. '
                                                                                    'Elasticsearch '
                                                                                    'B.V. '
                                                                                    'All '
                                                                                    'Rights '
                                                                                    'Reserved.'}],
                                                                'inference_id': 'my-elser-model',
                                                                'model_settings': {'task_type': 'sparse_embedding'}}},
                                'title': 'Elasticsearch ingest autoscaling — '
                                         'Search Labs',
                                'url': 'https://www.elastic.co/search-labs/blog/elasticsearch-ingest-autoscaling',
                                'url_host': 'www.elastic.co',
                                'url_path': '/search-labs/blog/elasticsearch-ingest-autoscaling',
                                'url_path_dir1': 'search-labs',
                                'url_path_dir2': 'blog',
                                'url_path_dir3': 'elasticsearch-ingest-autoscaling',
                                'url_port': 443,
                                'url_scheme': 'https'}}],
          'max_score': 1.2861483,
          'total': {'relation': 'eq', 'value': 228}},
 'timed_out': False,
 'took': 2}
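
The raw response above is hard to scan because each hit carries the full chunked text of the crawled page. A minimal sketch of trimming it down to score, title, and URL follows. It assumes the hit documents sit under the standard `hits.hits[]._source` path, which is not visible in the portion of the output shown here. `summarize_hits` and `sample_response` are hypothetical names introduced only for illustration, and the sample is a hand-trimmed stand-in that borrows values from the output rather than a real response.

```python
from typing import Any


def summarize_hits(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one small dict per hit with its score, title, and URL.

    Assumes the usual Elasticsearch search response layout:
    response["hits"]["hits"][i]["_source"] holds the indexed document.
    Missing keys yield None instead of raising.
    """
    summaries = []
    for hit in response.get("hits", {}).get("hits", []):
        source = hit.get("_source", {})
        summaries.append(
            {
                "score": hit.get("_score"),
                "title": source.get("title"),
                "url": source.get("url"),
            }
        )
    return summaries


# Hand-trimmed stand-in for the response printed above (illustrative only).
# The "_score" value is an assumption: it reuses max_score for the single hit.
sample_response = {
    "took": 2,
    "timed_out": False,
    "hits": {
        "total": {"relation": "eq", "value": 228},
        "max_score": 1.2861483,
        "hits": [
            {
                "_score": 1.2861483,
                "_source": {
                    "title": "Elasticsearch ingest autoscaling — Search Labs",
                    "url": "https://www.elastic.co/search-labs/blog/elasticsearch-ingest-autoscaling",
                },
            }
        ],
    },
}

for row in summarize_hits(sample_response):
    print(row["score"], row["title"], row["url"])
```

In the notebook, the same helper could be pointed at the response object returned by the search call. Using `.get()` throughout keeps the sketch tolerant of hits that lack a title or URL.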