RAG: Using Gemma LLM locally for question answering on private data

In this notebook, we'll build a RAG system around Google's Gemma model. We'll generate sparse vectors with Elastic's ELSER model and store them in Elasticsearch, then use semantic retrieval to find the most relevant passages and pass the top search results as context to Gemma. We'll use the Hugging Face transformers library to run Gemma in a local environment.

Setup

Elastic Credentials - Create an Elastic Cloud deployment to get all Elastic credentials (ELASTIC_CLOUD_ID, ELASTIC_API_KEY).

Hugging Face Token - To get started with the Gemma model, you must agree to Google's terms on Hugging Face and generate an access token with the write role.

Gemma Model - We're going to use gemma-2b-it, though Google has released four open Gemma models; you can substitute any of the others: gemma-2b, gemma-7b, or gemma-7b-it.

Install packages
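The exact package list is an assumption based on the steps below (the Elasticsearch client, LangChain, and the Hugging Face stack); a typical install cell looks like:

```shell
# Package set is an assumption; pin versions to match your environment.
pip install -q elasticsearch langchain langchain-community transformers accelerate huggingface_hub
```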


Import packages
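The standard-library imports below cover the download and credential steps; library-specific imports (LangChain, Transformers) appear in their own sections. The exact set is an assumption:

```python
# Standard-library imports used throughout the notebook (assumed set).
import json
import os
from getpass import getpass
from urllib.request import urlopen
```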


Get Credentials
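A sketch of the credentials cell, assuming the environment-variable names from the Setup section; the original cell likely prompts with getpass, so the interactive fallback below only fires in a terminal session:

```python
import os
import sys
from getpass import getpass

# Read the Elastic Cloud credentials from the environment, falling back to an
# interactive prompt when running in a terminal or notebook session.
ELASTIC_CLOUD_ID = os.environ.get("ELASTIC_CLOUD_ID", "")
ELASTIC_API_KEY = os.environ.get("ELASTIC_API_KEY", "")
if not ELASTIC_CLOUD_ID and sys.stdin.isatty():
    ELASTIC_CLOUD_ID = getpass("Elastic Cloud ID: ")
if not ELASTIC_API_KEY and sys.stdin.isatty():
    ELASTIC_API_KEY = getpass("Elastic API Key: ")
```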


Add documents

Let's download the sample dataset and deserialize the document.
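The URL below points at the workplace-documents sample dataset used across elasticsearch-labs notebooks; treat it as an assumption and substitute your own data if your copy of the notebook bundles a different file:

```python
import json
from urllib.request import urlopen

# Sample dataset used by other elasticsearch-labs notebooks (assumed here).
DATASET_URL = (
    "https://raw.githubusercontent.com/elastic/elasticsearch-labs/"
    "main/datasets/workplace-documents.json"
)

try:
    workplace_docs = json.loads(urlopen(DATASET_URL).read())
except Exception:
    # Offline fallback so the remaining cells still have something to index.
    workplace_docs = [
        {"name": "Sales Strategy", "content": "Our sales goals are to increase revenue."}
    ]

print(f"Loaded {len(workplace_docs)} documents")
```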


Split Documents into Passages
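The original cell most likely uses LangChain's RecursiveCharacterTextSplitter; the dependency-free sketch below shows the idea, with chunk size and overlap values as assumptions:

```python
def split_into_passages(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with overlap — a simplified
    stand-in for LangChain's RecursiveCharacterTextSplitter."""
    passages = []
    start = 0
    while start < len(text):
        passages.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return passages


chunks = split_into_passages("word " * 300, chunk_size=500, overlap=50)
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from at least one passage.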


Index Documents into Elasticsearch using ELSER

Before we begin indexing, ensure that you have downloaded and deployed the ELSER model in your deployment and that it is running on an ML node.


Hugging Face login
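A minimal sketch of the login step, assuming the token is available in the standard HF_TOKEN environment variable; in an interactive notebook, huggingface_hub's notebook_login() widget is the usual alternative:

```python
import os

from huggingface_hub import login

# Use the access token generated on Hugging Face (see Setup). Guarded so the
# cell is a no-op when no token is present.
token = os.environ.get("HF_TOKEN")
if token:
    login(token=token)
```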


Initialize the tokenizer with the model (google/gemma-2b-it)
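A sketch of the tokenizer cell; downloading google/gemma-2b-it requires having accepted the Gemma terms and an authenticated session, so the download is guarded on the token here:

```python
import os

from transformers import AutoTokenizer

MODEL_ID = "google/gemma-2b-it"

# Gated model: requires accepted terms and an authenticated session.
if os.environ.get("HF_TOKEN"):
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
```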


Create a text-generation pipeline and initialize with LLM
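A sketch of the pipeline cell, with max_new_tokens and the LangChain HuggingFacePipeline wrapper as assumptions; loading the 2B model needs several GB of RAM or VRAM, and device_map="auto" places it on a GPU when one is available:

```python
import os

from transformers import AutoModelForCausalLM, pipeline

# Guarded so the heavy download/load only runs with an authenticated session.
if os.environ.get("HF_TOKEN"):
    model = AutoModelForCausalLM.from_pretrained("google/gemma-2b-it", device_map="auto")
    pipe = pipeline(
        "text-generation",
        model=model,
        tokenizer=tokenizer,  # from the previous step
        max_new_tokens=256,
    )

    # Wrap the pipeline so it can be used as an LLM in a LangChain chain.
    from langchain_community.llms import HuggingFacePipeline
    llm = HuggingFacePipeline(pipeline=pipe)
```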


Format Docs
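This helper joins the retrieved passages into a single context string for the prompt; the separator is an assumption, and the small stand-in class below exists only to demonstrate the behaviour without a live retriever:

```python
def format_docs(docs):
    """Join retrieved passages into one context string for the prompt."""
    return "\n\n".join(doc.page_content for doc in docs)


# Tiny stand-in for a LangChain Document, just to show the behaviour.
class _Doc:
    def __init__(self, page_content):
        self.page_content = page_content


context = format_docs([_Doc("First passage."), _Doc("Second passage.")])
```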


Create a chain using Prompt template


Ask question
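The question below is inferred from the sample answer shown underneath; invoking the chain with it retrieves the relevant passages and has Gemma answer from that context:

```python
# Question inferred from the sample output; adjust to your own data.
question = "What are the sales goals?"

# chain = build_chain(retriever, llm)   # wired up in the previous sections
# print(chain.invoke(question))
```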

'Answer: The sales goals are to increase revenue, expand market share, and strengthen customer relationships in our target markets.'