Hallucination Score Calculator


Calculating a Hallucination Score with the OpenAIChatGenerator

In this cookbook, we show how to calculate a hallucination risk score for LLM answers, following the research paper "LLMs are Bayesian, in Expectation, not in Realization" and its accompanying GitHub repo: https://github.com/leochlon/hallbayes.

In this notebook, we'll use the OpenAIChatGenerator from haystack-experimental.

Setup Environment
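First, install the required packages. The original install cell is not shown; we assume the haystack-ai and haystack-experimental packages, matching the imports used later in this notebook:

```shell
pip install haystack-ai haystack-experimental
```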


Set up OpenAI API Key
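The OpenAIChatGenerator reads the API key from the OPENAI_API_KEY environment variable. A minimal setup cell, prompting for the key only when it is not already set:

```python
import os
from getpass import getpass

# Prompt for the key only if it isn't already present in the environment
if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass("Enter OpenAI API key: ")
```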

Enter OpenAI API key: ········

Closed Book Example

Based on the closed-book example from the original GitHub repo.

Decision: ANSWER
Risk bound: 0.000
Rationale: Δ̄=8.2088 nats, B2T=1.8947, ISR=4.332 (thr=1.000), extra_bits=0.200; EDFL RoH bound=0.000; y='answer'
Answer:
The 2019 Nobel Prize in Physics was awarded to three scientists for their contributions to understanding the universe. Half of the prize went to James Peebles for his theoretical discoveries in physical cosmology. The other half was jointly awarded to Michel Mayor and Didier Queloz for their discovery of an exoplanet orbiting a solar-type star.
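The rationale line prints the quantities behind the decision: the average information budget Δ̄ (in nats), the bits-to-trust B2T, and their ratio, the Information Sufficiency Ratio (ISR). As a sanity check, here is a minimal sketch of that decision rule in plain Python, simplified to the ISR check alone (function names are ours, not from the toolkit):

```python
def information_sufficiency_ratio(delta_bar: float, b2t: float) -> float:
    """ISR: average information budget (nats) divided by bits-to-trust (nats)."""
    return delta_bar / b2t

def decide(delta_bar: float, b2t: float, isr_threshold: float = 1.0) -> str:
    """Answer only when the information budget clears the bits-to-trust bar."""
    isr = information_sufficiency_ratio(delta_bar, b2t)
    return "ANSWER" if isr >= isr_threshold else "REFUSE"

# Reproduce the closed-book run above (small differences come from rounding
# of the printed inputs):
isr = information_sufficiency_ratio(8.2088, 1.8947)
print(f"ISR={isr:.3f}")        # ≈ 4.332, matching the rationale line
print(decide(8.2088, 1.8947))  # ANSWER
```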

Evidence-based Example

Based on the evidence-based example from the original GitHub repo.

Decision: ANSWER
Risk bound: 0.541
Rationale: Δ̄=12.0000 nats, B2T=1.8947, ISR=6.333 (thr=1.000), extra_bits=0.200; EDFL RoH bound=0.541; y='answer'
Answer:
The Nobel Prize in Physics in 2019 was awarded to James Peebles, who received half of the prize, and to Michel Mayor and Didier Queloz, who shared the other half of the prize.
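The EDFL risk-of-hallucination (RoH) bound reported in the rationale relates the information budget Δ̄ to a worst-case hallucination rate. Below is one plausible reconstruction, assuming the bound is the smallest rate h whose complement 1−h is reachable from a prior success rate q̄ within Δ̄ nats, measured by a Bernoulli KL divergence. This is our reading of the paper; the toolkit's exact computation, including how q̄ and the clipping are chosen, is not shown in this notebook:

```python
import math

def kl_bernoulli(p: float, q: float) -> float:
    """KL(Ber(p) || Ber(q)) in nats, with clamping for numerical safety."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))

def edfl_roh_bound(delta_bar: float, q_prior: float, iters: int = 100) -> float:
    """Smallest hallucination rate h such that KL(Ber(1-h) || Ber(q_prior))
    fits within the information budget delta_bar, found by bisection."""
    lo, hi = 0.0, 1.0 - q_prior  # h = 1 - q_prior gives KL = 0, always feasible
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if kl_bernoulli(1.0 - mid, q_prior) <= delta_bar:
            hi = mid  # feasible: try a smaller risk bound
        else:
            lo = mid
    return hi

# With no information budget, the bound collapses to roughly the prior risk:
print(f"RoH bound at zero budget: {edfl_roh_bound(0.0, 0.01):.3f}")
```

The bound is monotone: a larger Δ̄ always yields a smaller (or equal) risk bound for the same prior, which is why the zero-budget run later in this notebook is forced to refuse.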

RAG-based Example

Create a Document Store and index some documents

2

Create a RAG Question Answering pipeline

<haystack.core.pipeline.pipeline.Pipeline object at 0x1426632e0>
🚅 Components
  - retriever: InMemoryBM25Retriever
  - prompt_builder: ChatPromptBuilder
  - llm: OpenAIChatGenerator
🛤️ Connections
  - retriever.documents -> prompt_builder.documents (list[Document])
  - prompt_builder.prompt -> llm.messages (list[ChatMessage])

Run a query that is answerable based on the evidence
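The query cell is not shown. The helper below sketches how the pipeline from the previous cell would be invoked; the hallucination-score post-processing that produces the Decision/Risk bound/Rationale lines wraps around this call and is not reproduced here:

```python
def ask(pipeline, question: str) -> str:
    """Run the RAG pipeline for a question and return the first LLM reply's text."""
    result = pipeline.run({
        "retriever": {"query": question},
        "prompt_builder": {"question": question},
    })
    return result["llm"]["replies"][0].text

# Requires the pipeline built above and a valid OpenAI key:
# print(ask(pipeline, "Who won the Nobel Prize in Physics in 2019?"))
```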

Decision: ANSWER
Risk bound: 0.541
Rationale: Δ̄=12.0000 nats, B2T=1.8947, ISR=6.333 (thr=1.000), extra_bits=0.200; EDFL RoH bound=0.541; y='answer'
Answer:
The Nobel Prize in Physics in 2019 was awarded to James Peebles (1/2), and Michel Mayor & Didier Queloz (1/2).

Run a query that should not be answered

Decision: REFUSE
Risk bound: 1.000
Rationale: Δ̄=0.0000 nats, B2T=1.8947, ISR=0.000 (thr=1.000), extra_bits=0.200; EDFL RoH bound=1.000; y='refuse'
Answer:
The evidence provided does not include information about the Nobel Prize in Physics for the year 2022. Therefore, I cannot answer the question based on the evidence provided.