Advanced RAG: Context Enrichment Window
Vanilla RAG works well, but some data, such as conversation history, calls for smaller chunks because larger ones add unnecessary noise. Very small chunks, such as a couple of replies each, can work, but important context from the preceding or following replies may be lost. Bigger chunks would keep that context, but they bring their own problems: more noise, and fewer chunks fit into the prompt. The solution is context enrichment: retrieve small chunks, then expand each hit with its neighbouring chunks.
Let's see how it can be done.
Download data
The source document is a history chapter PDF downloaded from https://ncert.nic.in/textbook/pdf/jess301.pdf (5.4 MB) and saved as ./data/history_chapter.pdf.
Table creation and data ingestion
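As a rough illustration of what gets ingested, the sketch below splits a text into overlapping chunks and tags each one with its position. The function name `chunk_with_ids`, the `chunk_id` field, and the character-based sizes are assumptions made for this example, not necessarily what the notebook uses; the only property that matters later is that consecutive chunks carry consecutive IDs.

```python
def chunk_with_ids(text: str, chunk_size: int = 200, overlap: int = 50) -> list[dict]:
    """Split `text` into overlapping character windows, each tagged with its position.

    Hypothetical helper: sizes are in characters purely for illustration.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    rows = []
    for chunk_id, start in enumerate(range(0, len(text), step)):
        rows.append({"chunk_id": chunk_id, "text": text[start:start + chunk_size]})
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return rows


rows = chunk_with_ids("abcdefghij" * 3, chunk_size=10, overlap=4)
for row in rows:
    print(row["chunk_id"], repr(row["text"]))
```

Each row would then be embedded and written to the table alongside its `chunk_id`, so that a search hit can be mapped back to its position in the document.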
So we have created our table, where each text chunk has an index associated with it. Let's now do a simple search.
Suppose the most relevant chunks for the query are 14, 86, and 16.
With NEIGHBOUR_WINDOW=1, we get the following groups of chunk IDs: 13, 14, 15; 85, 86, 87; and 15, 16, 17.
Notice chunk ID 15 appears in two groups. It makes sense to associate it with the higher priority group (14), which has the minimum distance. Now, let's write the code to get the neighbors.
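The rule above can be sketched as follows. This is a minimal version under stated assumptions: `expand_with_neighbours` is a hypothetical name, the hits arrive as `(chunk_id, distance)` pairs, and the distances in the example are made up for illustration. Hits are processed from smallest to largest distance, so a chunk that falls into two windows stays with the closer group.

```python
NEIGHBOUR_WINDOW = 1


def expand_with_neighbours(hits, window=NEIGHBOUR_WINDOW, num_chunks=None):
    """Map every hit and its neighbours to the distance of the hit that claimed them.

    hits: iterable of (chunk_id, distance) pairs from the vector search.
    num_chunks: optional table size, used to drop IDs past the last chunk.
    """
    scores = {}
    # Smallest distance first, so higher-priority groups claim shared chunks.
    for chunk_id, distance in sorted(hits, key=lambda hit: hit[1]):
        candidates = [chunk_id]
        for offset in range(1, window + 1):
            candidates += [chunk_id - offset, chunk_id + offset]
        for cid in candidates:
            if cid < 0 or (num_chunks is not None and cid >= num_chunks):
                continue  # neighbour falls outside the document
            scores.setdefault(cid, distance)  # keep the first (closest) owner
    return scores


# Illustrative distances only; they are not taken from a real search.
hits = [(14, 0.60), (86, 0.62), (16, 0.63)]
neighbour_scores = expand_with_neighbours(hits)
print(neighbour_scores)
```

With these example distances, chunk 15 is claimed by the group around chunk 14, and the group around chunk 16 is left with 16 and 17 only.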
{16: 0.6005803346633911,
 15: 0.6005803346633911,
 17: 0.6005803346633911,
 49: 0.6166247129440308,
 48: 0.6166247129440308,
 50: 0.6166247129440308,
 8: 0.6333437561988831,
 7: 0.6333437561988831,
 9: 0.6333437561988831}
Now let's group and rerank these chunks.
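One way to do the grouping, sketched under the assumption that every chunk in a group carries the distance of the hit that claimed it (as in the dictionary above): bucket the chunk IDs by distance, rank the buckets from smallest to largest distance, and sort the IDs inside each bucket so the text reads in document order. The name `group_and_rank` is hypothetical.

```python
from collections import defaultdict


def group_and_rank(scores):
    """Group chunk IDs that share a distance and rank the groups, closest first."""
    by_distance = defaultdict(list)
    for chunk_id, distance in scores.items():
        by_distance[distance].append(chunk_id)
    ranked = sorted(by_distance.items(), key=lambda item: item[0])
    # Sort IDs inside each group so the chunks can be stitched in reading order.
    return {rank: sorted(ids) for rank, (_, ids) in enumerate(ranked)}


neighbour_scores = {
    16: 0.6005803346633911, 15: 0.6005803346633911, 17: 0.6005803346633911,
    49: 0.6166247129440308, 48: 0.6166247129440308, 50: 0.6166247129440308,
    8: 0.6333437561988831, 7: 0.6333437561988831, 9: 0.6333437561988831,
}
groups = group_and_rank(neighbour_scores)
print(groups)
```

Keying on the distance value is a shortcut: two different hits with exactly the same distance would be merged into one group. Storing the ID of the owning hit next to each chunk would avoid that.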
{0: [15, 16, 17], 1: [48, 49, 50], 2: [7, 8, 9]}
Now we simply go group by group, concatenating the chunks in each group and removing the overlapping prefix from the second chunk onwards.
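A minimal sketch of that stitching step, assuming the overlap between consecutive chunks is an exact string match: for every chunk after the first, find the longest prefix that is also a suffix of the previous chunk and drop it before appending. `strip_overlap`, `merge_group`, and the sample texts are hypothetical and only illustrate the idea.

```python
def strip_overlap(previous: str, current: str) -> str:
    """Remove the longest prefix of `current` that is also a suffix of `previous`."""
    max_len = min(len(previous), len(current))
    for size in range(max_len, 0, -1):
        if previous.endswith(current[:size]):
            return current[size:]
    return current  # no overlap found, keep the chunk unchanged


def merge_group(chunk_ids, chunk_text):
    """Concatenate the chunks of one group, de-duplicating the overlapping seams."""
    merged = ""
    previous = None
    for cid in chunk_ids:
        text = chunk_text[cid]
        merged += text if previous is None else strip_overlap(previous, text)
        previous = text
    return merged


# Made-up chunk texts with overlapping boundaries.
chunk_text = {
    15: "the French flag was",
    16: "flag was chosen to",
    17: "chosen to replace it",
}
context = merge_group([15, 16, 17], chunk_text)
print(context)
```

Because the match is exact, any difference in whitespace or normalisation between neighbouring chunks leaves the seam unmerged, so it is worth spot-checking the stitched contexts.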
## Context - 1:
From the very beginning, the French revolutionaries introduced various measures and practices that could create a sense of collective identity amongst the French people. The ideas of la patrie (the fatherland) and le citoyen (the citizen) emphasised the notion of a united community enjoying equal rights under a constitution. A new French flag, the tricolour, was chosen to replace the former royal standard. The Estates General was elected by thew hymns were composed, oaths taken and martyrs commemorated, all in the name of the nation. A centralised administrative system was put in place and it formulated uniform laws for all citizens within its territory. Internal customs duties and dues were abolished and a uniform system of weights and measures was adopted.Regional dialects were discouraged and French, as it was spoken and written in Paris, became the common language of the nation. The revolutionaries further declared that it was the mission and the destiny of the French nation to liberate the peoples of Europe from despotism, in other words to help other peoples of Europe to become nations.

## Context - 2:
economic nationalism strengthened the wider nationalist sentiments growing at the time. 2.3 A New Conservatism after 1815 Following the defeat of Napoleon in 1815, European governments were driven by a spirit of conservatism. Conservatives believed that established, traditional institutions of state and society – like the monarchy, the Church, social hierarchies, property and the family – should be preserved. Most conservatives, however, did not proposey realised, from the changes initiated by Napoleon, that modernisation could in fact strengthen traditional institutions like the monarchy. It could make state power more effective and strong. A modern army, an efficient bureaucracy, a dynamic economy, the abolition of feudalism and serfdom could strengthen the autocratic monarchies of Europe.sia, Prussia and Austria – who had collectively defeated Napoleon, met at Vienna to draw up a settlement for Europe. The Congress was hosted by the Austrian Chancellor Duke Metternich. The delegates Economists began to think in terms of the national economy. They talked of how the nation could develop and what economic measures could help forge this nation together.

## Context - 3:
the world. This chapter will deal with many of the issues visualised by Sorrieu in Fig. 1. During the nineteenth century, nationalism emerged as a force which brought about sweeping changes in the political and mental world of Europe. The end result of these changes was the emergence of the nation-state in place of the multi-national dynastic empires of Europe. The concept and practices of a modern state, in which a centralised power exercised sovereign control over a clearlyf time in Europe. But a nation-state was one in which the majority of its citizens, and not only its rulers, came to develop a sense of common identity and shared history or descent. This commonness did not exist from time immemorial; it was forged through struggles, through the actions of leaders and the common people. This chapter willtionalism came into being in nineteenth-century Europe. Ernst Renan, ‘What is a Nation?’ In a lecture delivered at the University of Sorbonne in 1882, the French philosopher Ernst Renan (1823-92) outlined his understanding of what makes a nation. The lecture was subsequently published as a famous essay entitled ‘Qu’est-ce qu’une nation?’ (‘What is a Nation?’).