Weights and Biases Quickstart Huggingface

Quickstart Huggingface

wandb-examplesweavedocs

alph-notebooks/wandb-examples / quickstart_huggingface.ipynb

Export

Run Notebooks

Contents

No cells yet

Add cells to see them here

Hugging Face

Hugging Face Hub is a machine learning platform for creators and collaborators containing pre-trained models and datasets for your projects. It also offers an easy and unified access to serverless AI inference through multiple inference providers, like Together AI, Sambanova and Fireworks AI.

You can easily browse supported providers and models directly on the Hub - for example, all Fireworks AI supported models can be found here.

The huggingface_hub library provides a simple and efficient interface for running inference on Hugging Face models across providers through the InferenceClient.

Let's first install huggingface_hub and weave libraries:

[ ]

Authentication

With a single Hugging Face Token, you can access inference through multiple providers. Your calls are routed through Hugging Face and the usage is billed directly to your Hugging Face account at the standard provider API rates.

To get started:

Create your Hugging Face Token at https://huggingface.co/settings/tokens.
Set the HF_TOKEN environment variable by either:
- Adding it to your Google Colab secrets.
- Or using the code below:

[ ]

Tracing

It’s important to store traces of language model applications in a central location, both during development and in production. These traces can be useful for debugging, and as a dataset that will help you improve your application.

To use a model from the Hugging Face Hub, you need to specify the provider when initializing the InferenceClient object. You can find the list of supported providers here.

The following example shows how to use the Llama-3.2-11B-Vision-Instruct model through Together AI. Weave will automatically capture traces for InferenceClient.

To start tracking, call weave.init() and use the library as normal.

[ ]

Track your own ops

Wrapping a function with @weave.op starts capturing inputs, outputs and app logic so you can debug how data flows through your app. You can deeply nest ops and build a tree of functions that you want to track. This also starts automatically versioning code as you experiment to capture ad-hoc details that haven't been committed to git.

Simply create a function decorated with @weave.op.

In the example below, we have the functions generate_image, check_image_correctness, and generate_image_and_check_correctness which are wrapped with @weave.op that generates an image and checks if it is correct for a given prompt.

[ ]

Create a `Model` for easier experimentation

Organizing experimentation is difficult when there are many moving pieces. By using the Model class, you can capture and organize the experimental details of your app like your system prompt or the model you're using. This helps organize and compare different iterations of your app.

In addition to versioning code and capturing inputs/outputs, a Model captures structured parameters that control your application’s behavior, making it easy to find what parameters worked best. You can also use Weave a Model with serve, and Evaluations.

In the example below, you can experiment with CityVisitRecommender. Every time you change one of these, you'll get a new version of CityVisitRecommender.

[ ]