Zero-shot Text Classification

Your class names are likely already good descriptors of the text that you're looking to classify. With 🤗 SetFit, you can combine these class names with a pretrained Sentence Transformer model to get a strong baseline without any training samples.

This guide will show you how to perform zero-shot text classification.

Testing dataset

We'll use the dair-ai/emotion dataset to test the performance of our zero-shot model.
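
A minimal sketch of loading the test split, assuming the `datasets` library is installed and the Hugging Face Hub is reachable. The helper name `load_emotion_test_split` is hypothetical and only used for illustration; the `"split"` configuration name is an assumption based on the dataset card.

```python
def load_emotion_test_split():
    """Download (or load from cache) the test split of dair-ai/emotion."""
    # Imported lazily so that defining this helper does not require the library.
    from datasets import load_dataset

    # Assumption: "split" is the configuration that provides train/validation/test.
    return load_dataset("dair-ai/emotion", "split", split="test")
```

Calling `test_dataset = load_emotion_test_split()` should then give a dataset with `text` and `label` columns.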

This dataset stores the class names within the dataset Features, so we'll extract the classes like so:
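
As a rough sketch, the class names can be read from the `label` feature, assuming it is a `ClassLabel` whose `names` list maps each integer label to its name. The helper name `get_class_names` is hypothetical.

```python
def get_class_names(dataset, label_column="label"):
    """Return the list of class names stored in the dataset's features."""
    # For a ClassLabel feature, `names[i]` is the name of integer label `i`.
    return list(dataset.features[label_column].names)
```

With the test split loaded as `test_dataset`, `classes = get_class_names(test_dataset)` would give the list of emotion names in label order.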

Otherwise, we could manually set the list of classes.
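
For example, a manually written list for this dataset might look as follows. The order is an assumption taken from the dataset card; it matters because the position of each name must match the integer label used in the test set.

```python
# Assumed label order for dair-ai/emotion (index -> class name).
classes = ["sadness", "joy", "love", "anger", "fear", "surprise"]

# Optional lookup from class name back to integer label.
label2id = {name: idx for idx, name in enumerate(classes)}
print(label2id["joy"])  # 1
```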

Synthetic dataset

Then, we can use get_templated_dataset() to synthetically generate a dummy dataset given these class names.
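
To make the idea concrete, here is a pure-Python sketch of what a templated dataset looks like conceptually. It is an illustration and not SetFit's implementation: `build_templated_examples` is a hypothetical helper, and the template string and `sample_size=8` are assumptions.

```python
def build_templated_examples(candidate_labels, sample_size=8,
                             template="This sentence is {}"):
    """Create `sample_size` templated sentences per class name."""
    examples = []
    for label_id, label_name in enumerate(candidate_labels):
        for _ in range(sample_size):
            examples.append(
                {"text": template.format(label_name), "label": label_id}
            )
    return examples


classes = ["sadness", "joy", "love", "anger", "fear", "surprise"]
examples = build_templated_examples(classes, sample_size=8)
print(len(examples))  # 48
print(examples[0])    # {'text': 'This sentence is sadness', 'label': 0}
```

With SetFit itself, the corresponding call would be along the lines of `train_dataset = get_templated_dataset(candidate_labels=classes, sample_size=8)`, where the sample size is again an assumed value.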

Training

We can use this dataset to train a SetFit model just like normal:
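
A sketch of the training step, assuming SetFit 1.0 or later. The batch size and number of epochs mirror the training log in this section, and the model name is the one used in the latency comparison; the helper name `train_zero_shot_model` is hypothetical.

```python
def train_zero_shot_model(train_dataset, eval_dataset=None,
                          model_id="BAAI/bge-small-en-v1.5"):
    """Fine-tune a SetFit model on the synthetic dataset and return the trainer."""
    # Imported lazily so that defining this helper does not require setfit.
    from setfit import SetFitModel, Trainer, TrainingArguments

    model = SetFitModel.from_pretrained(model_id)
    # Values taken from the training log: batch size 32, one epoch.
    args = TrainingArguments(batch_size=32, num_epochs=1)
    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
    )
    trainer.train()
    return trainer
```

Typical usage would be `trainer = train_zero_shot_model(train_dataset, test_dataset)`, after which the trained model is available as `trainer.model`.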

***** Running training *****
  Num examples = 60
  Num epochs = 1
  Total optimization steps = 60
  Total train batch size = 32
{'embedding_loss': 0.2628, 'learning_rate': 3.3333333333333333e-06, 'epoch': 0.02}
{'embedding_loss': 0.0222, 'learning_rate': 3.7037037037037037e-06, 'epoch': 0.83}
{'train_runtime': 15.4717, 'train_samples_per_second': 124.098, 'train_steps_per_second': 3.878, 'epoch': 1.0}
100%|██████████| 60/60 [00:09<00:00,  6.35it/s]

Once trained, we can evaluate the model:
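
With a SetFit `Trainer` that was given an evaluation dataset, this is typically `metrics = trainer.evaluate()`, which reports accuracy by default. As a reminder of what that number means, here is a small stand-alone sketch of the metric; the `accuracy` helper is written for illustration and is not SetFit's own code.

```python
def accuracy(predictions, references):
    """Fraction of predictions that equal the reference labels."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(1 for p, r in zip(predictions, references) if p == r)
    return correct / len(references)


print(accuracy([0, 1, 2, 2], [0, 1, 1, 2]))  # 0.75
```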

***** Running evaluation *****
{'accuracy': 0.591}

And run predictions:
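
A sketch of turning predictions back into class names, assuming `model.predict` returns one integer label per input (as it should when the model is trained on the integer-labelled synthetic dataset). The helper name `predict_class_names` is hypothetical.

```python
def predict_class_names(model, texts, classes):
    """Predict label indices for `texts` and map them to class names."""
    preds = model.predict(texts)
    # int() handles plain ints as well as tensor or NumPy scalar values.
    return [classes[int(idx)] for idx in preds]
```

For example, `predict_class_names(model, ["i feel incredibly lucky today"], classes)` would return a one-element list containing the predicted emotion name.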

These predictions all look right!

Baseline

To show that SetFit performs well in the zero-shot setting, we'll compare it against a zero-shot classification model from transformers.
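
A sketch of such a baseline, assuming the `zero-shot-classification` pipeline from transformers with `facebook/bart-large-mnli` (the model named in the latency comparison). The helper names are hypothetical, and `device=-1` (CPU) is an assumed default; pass a GPU index to run on a GPU.

```python
def top_label_ids(outputs, classes):
    """Map pipeline outputs to the integer label of the top-scoring class."""
    # Each output lists candidate labels sorted by descending score.
    return [classes.index(out["labels"][0]) for out in outputs]


def zero_shot_pipeline_predict(texts, classes,
                               model_id="facebook/bart-large-mnli", device=-1):
    """Classify `texts` with a transformers zero-shot pipeline."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    pipe = pipeline("zero-shot-classification", model=model_id, device=device)
    outputs = pipe(list(texts), candidate_labels=classes)
    return top_label_ids(outputs, classes)
```

The returned label indices can then be compared against the test set labels to obtain the baseline accuracy.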

With its 59.1% accuracy, zero-shot SetFit heavily outperforms the zero-shot model recommended by transformers.

Prediction latency

Beyond reaching higher accuracy, SetFit is much faster too. Let's compute the latency of SetFit with BAAI/bge-small-en-v1.5 versus the latency of transformers with facebook/bart-large-mnli. Both tests were performed on a GPU.
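
One way to measure this is sketched below: a generic timing helper that works with any prediction callable. The helper name, the warm-up run, and the single timed pass are assumptions for illustration, not necessarily how the numbers in this section were obtained.

```python
import time


def latency_ms_per_sentence(predict_fn, sentences, warmup=1):
    """Average wall-clock latency of `predict_fn` in milliseconds per sentence."""
    if not sentences:
        raise ValueError("need at least one sentence to time")
    # Warm-up calls keep one-off setup costs out of the measurement.
    for _ in range(warmup):
        predict_fn(sentences)
    start = time.perf_counter()
    predict_fn(sentences)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(sentences)
```

For SetFit this could be called as `latency_ms_per_sentence(model.predict, sentences)`, and for the transformers baseline with a small function that wraps the pipeline call and its candidate labels.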

`transformers` with `facebook/bart-large-mnli` latency: 31.1765ms per sentence

SetFit with `BAAI/bge-small-en-v1.5` latency: 0.4600ms per sentence

So, SetFit with BAAI/bge-small-en-v1.5 is roughly 68x faster than transformers with facebook/bart-large-mnli, in addition to being more accurate:

Figure: zero_shot_transformers_vs_setfit
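
The speedup follows directly from the two latencies reported above; a short check of the arithmetic:

```python
# Latencies reported in this section, in milliseconds per sentence.
transformers_ms = 31.1765
setfit_ms = 0.4600

speedup = transformers_ms / setfit_ms
print(f"{speedup:.1f}x")  # 67.8x
```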