Arize AI

Anthropic Tracing Tutorial

Tracing and Evaluating a Structured Data Extraction Service

In this tutorial, you will:

  • Use Anthropic's tool calling to perform structured data extraction: the task of transforming unstructured input (e.g., user requests in natural language) into a structured format (e.g., tabular data),
  • Instrument your Anthropic client to record trace data in OpenInference tracing format,
  • Inspect the traces and spans of your application to visualize your trace data,
  • Export your trace data to run an evaluation on the quality of your structured extractions.

Background

One powerful feature of the Anthropic API is tool use (function calling), wherein a user describes the signature and arguments of one or more functions to the Anthropic API via a JSON Schema and natural language descriptions, and the LLM decides when to call each function and provides argument values depending on the context of the conversation. In addition to its primary purpose of integrating function inputs and outputs into a sequence of chat messages, function calling is also useful for structured data extraction, since you can specify a "function" that describes the desired format of your structured output. Structured data extraction is useful for a variety of purposes, including ETL or as input to another machine learning model such as a recommender system.

While it's possible to produce structured output without using function calling via careful prompting, function calling is more reliable at producing output that conforms to a particular format. For more details on Anthropic's function calling API, see the Anthropic documentation.

Let's get started!

ℹ️ This notebook requires an Anthropic API key.

1. Install Dependencies and Import Libraries

Install dependencies.

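The original install cell is not preserved in this export; a plausible set of packages for the steps that follow (package names are assumptions based on the Phoenix / OpenInference ecosystem) is:

```shell
pip install -q anthropic arize-phoenix openinference-instrumentation-anthropic pandas
```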

Import libraries.


2. Configure Your Anthropic API Key and Instantiate Your Anthropic Client

Set your Anthropic API key if it is not already set as an environment variable.


3. Instrument Your Anthropic Client

Instrument your Anthropic client with a tracer that emits telemetry data in OpenInference format. OpenInference is an open standard for capturing and storing LLM application traces that enables LLM applications to seamlessly integrate with LLM observability solutions such as Phoenix.


4. Run Phoenix in the Background

Launch Phoenix as a background session to collect the trace data emitted by your instrumented Anthropic client.


5. Extract Your Structured Data

We'll extract structured data from the following list of ten travel requests.

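The original list of ten requests is not preserved in this export; a few hypothetical stand-ins illustrate the shape of the input:

```python
# Hypothetical stand-ins for the tutorial's original ten travel requests.
travel_requests = [
    "I want a backpacking trip through Vietnam on a shoestring budget.",
    "Looking to book a luxury honeymoon in the Maldives this winter.",
    "Planning an affordable family vacation to Orlando for the theme parks.",
    "I need a business trip to Berlin with a mid-range hotel near the city center.",
]
```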

The Anthropic API uses JSON Schema and natural language descriptions to specify the signature of a function to call. In this case, we'll describe a function to record the following attributes of the unstructured text input:

  • location: The desired destination,
  • budget_level: A categorical budget preference,
  • purpose: The purpose of the trip.

The use of JSON Schema enables us to define the type of each field in the output and even enumerate valid values in the case of categorical outputs. Anthropic function calling can thus be used for tasks that might previously have been performed by named-entity recognition (NER) and/or classification models.

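A sketch of such a tool definition in the Anthropic `tools` format. The three field names come from the list above; the tool name, descriptions, and enum values are illustrative assumptions:

```python
tools = [
    {
        "name": "record_travel_request_attributes",
        "description": "Record the attributes of a travel request.",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The desired destination.",
                },
                "budget_level": {
                    "type": "string",
                    # Enum values are assumptions; adjust to your categories.
                    "enum": ["low", "medium", "high", "luxury"],
                    "description": "The traveler's budget preference.",
                },
                "purpose": {
                    "type": "string",
                    "enum": ["business", "pleasure", "other"],
                    "description": "The purpose of the trip.",
                },
            },
            "required": ["location", "budget_level", "purpose"],
        },
    }
]
```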

Run the extractions.

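A hedged sketch of the extraction call. `tool_choice={"type": "tool", ...}` forces the model to call the extraction tool on every request; the model name is an assumption, and `client` and `tools` are the objects defined in the earlier cells:

```python
def extract(client, tools, request_text):
    """Ask Claude to call the extraction tool and return its input dict."""
    response = client.messages.create(
        model="claude-3-5-sonnet-20240620",  # model name is an assumption
        max_tokens=1024,
        tools=tools,
        # Force the model to call the extraction tool rather than reply in prose.
        tool_choice={"type": "tool", "name": "record_travel_request_attributes"},
        messages=[{"role": "user", "content": request_text}],
    )
    # With a forced tool choice, the first content block is the tool_use
    # block, whose .input holds the extracted fields.
    return response.content[0].input
```

Running `extractions = [extract(client, tools, r) for r in travel_requests]` performs the extractions; each call emits a traced span.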

Your trace data should appear in real time in the Phoenix UI.


6. Export and Evaluate Your Trace Data

Your OpenInference trace data is collected by Phoenix and can be exported to a pandas dataframe for further analysis and evaluation.

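Phoenix exposes the collected spans via `px.Client().get_spans_dataframe()`. A small helper, sketched below, narrows the export to the LLM spans and the input/output columns an evaluation needs; the column names follow Phoenix's flattened OpenInference conventions and should be verified against `spans_df.columns` in your own export:

```python
import pandas as pd

def llm_io(spans_df: pd.DataFrame) -> pd.DataFrame:
    """Keep only LLM spans and the input/output columns used for evaluation."""
    # "span_kind" and the "attributes.*" column names are assumptions based
    # on Phoenix's flattened OpenInference span schema.
    llm_spans = spans_df[spans_df["span_kind"] == "LLM"]
    return llm_spans[["attributes.input.value", "attributes.output.value"]]
```

Usage: `spans_df = px.Client().get_spans_dataframe()` followed by `llm_io(spans_df)` yields one row per extraction call, ready to score.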

7. Recap

Congrats! In this tutorial, you:

  • Built a service to perform structured data extraction on unstructured text using Anthropic function calling
  • Instrumented your service with an OpenInference tracer
  • Examined your telemetry data in Phoenix

Check back soon for tips on evaluating the performance of your service using LLM evals.