Hallucination Score Calculator
Calculating a Hallucination Score with the OpenAIChatGenerator
In this cookbook, we show how to calculate a hallucination risk score based on the research paper *LLMs are Bayesian, in Expectation, not in Realization* and the accompanying GitHub repo, https://github.com/leochlon/hallbayes.
In this notebook, we'll use the OpenAIChatGenerator from haystack-experimental.
Setup Environment
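The setup cell is not included in this export. A minimal sketch of the environment setup, assuming the hallbayes toolkit is taken straight from its GitHub repo (it is not on PyPI at the time of writing; check the repo README for the exact steps):

```shell
pip install haystack-ai haystack-experimental
# Assumption: the hallbayes toolkit is used from a local clone of the repo
git clone https://github.com/leochlon/hallbayes.git
```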
Set up OpenAI API Key
Enter OpenAI API key: ········
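The key-entry cell itself is missing from this export. A minimal sketch using the standard-library `getpass`, which produces the masked prompt shown above:

```python
import os
from getpass import getpass


def ensure_openai_key() -> None:
    """Prompt for the OpenAI API key only when it is not already set in the environment."""
    if "OPENAI_API_KEY" not in os.environ:
        os.environ["OPENAI_API_KEY"] = getpass("Enter OpenAI API key: ")
```

Calling `ensure_openai_key()` prompts once per session and leaves an existing key untouched.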
Closed Book Example
Based on the closed-book example in the original GitHub repo, https://github.com/leochlon/hallbayes.
Decision: ANSWER
Risk bound: 0.000
Rationale: Δ̄=8.2088 nats, B2T=1.8947, ISR=4.332 (thr=1.000), extra_bits=0.200; EDFL RoH bound=0.000; y='answer'
Answer: The 2019 Nobel Prize in Physics was awarded to three scientists for their contributions to understanding the universe. Half of the prize went to James Peebles for his theoretical discoveries in physical cosmology. The other half was jointly awarded to Michel Mayor and Didier Queloz for their discovery of an exoplanet orbiting a solar-type star.
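The rationale line can be unpacked as follows: Δ̄ is the estimated information budget in nats, B2T the bits-to-trust requirement, and ISR (information sufficiency ratio) their quotient; the model answers only when ISR clears the threshold. A small sketch of this gate, inferred from the numbers printed above (the exact EDFL risk-of-hallucination bound is computed inside the hallbayes toolkit and is not reproduced here):

```python
def isr_gate(delta_bar: float, b2t: float, threshold: float = 1.0) -> tuple[float, str]:
    """ISR gate as inferred from the printed rationale: ISR = Δ̄ / B2T,
    answer only when ISR >= threshold. This is a reading of the output
    above, not the toolkit's actual implementation."""
    isr = delta_bar / b2t if b2t > 0 else 0.0
    decision = "ANSWER" if isr >= threshold else "REFUSE"
    return isr, decision


# Numbers from the closed-book run above: Δ̄=8.2088, B2T=1.8947
isr, decision = isr_gate(8.2088, 1.8947)
print(f"ISR={isr:.3f}, decision={decision}")
```

With Δ̄=0 (no information budget, as in the unanswerable query later in this notebook) the gate returns ISR=0 and REFUSE.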
Evidence-based Example
Based on the evidence-based example in the original GitHub repo, https://github.com/leochlon/hallbayes.
Decision: ANSWER
Risk bound: 0.541
Rationale: Δ̄=12.0000 nats, B2T=1.8947, ISR=6.333 (thr=1.000), extra_bits=0.200; EDFL RoH bound=0.541; y='answer'
Answer: The Nobel Prize in Physics in 2019 was awarded to James Peebles, who received half of the prize, and to Michel Mayor and Didier Queloz, who shared the other half of the prize.
RAG-based Example
Create a Document Store and index some documents
2
Create a RAG Question Answering pipeline
<haystack.core.pipeline.pipeline.Pipeline object at 0x1426632e0>
🚅 Components
  - retriever: InMemoryBM25Retriever
  - prompt_builder: ChatPromptBuilder
  - llm: OpenAIChatGenerator
🛤️ Connections
  - retriever.documents -> prompt_builder.documents (list[Document])
  - prompt_builder.prompt -> llm.messages (list[ChatMessage])
Run a query that is answerable based on the evidence
Decision: ANSWER
Risk bound: 0.541
Rationale: Δ̄=12.0000 nats, B2T=1.8947, ISR=6.333 (thr=1.000), extra_bits=0.200; EDFL RoH bound=0.541; y='answer'
Answer: The Nobel Prize in Physics in 2019 was awarded to James Peebles (1/2), and Michel Mayor & Didier Queloz (1/2).
Run a query that should not be answered
Decision: REFUSE
Risk bound: 1.000
Rationale: Δ̄=0.0000 nats, B2T=1.8947, ISR=0.000 (thr=1.000), extra_bits=0.200; EDFL RoH bound=1.000; y='refuse'
Answer: The evidence provided does not include information about the Nobel Prize in Physics for the year 2022. Therefore, I cannot answer the question based on the evidence provided.