Text2Text Generation with BloomZ
SageMaker JumpStart Foundation Models - BloomZ: Multilingual Text Classification, Question Answering, Code Generation, Paragraph Rephrasing, and More
This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook.
Welcome to Amazon SageMaker JumpStart! You can use SageMaker JumpStart to solve many machine learning tasks with one click in SageMaker Studio, or through the SageMaker Python SDK.
In this demo notebook, we demonstrate how to use the SageMaker Python SDK to deploy a Foundation Model as an endpoint and use it for various NLP tasks. The model performs Text2Text Generation: it takes a text prompt as input and returns the text generated by the model according to the prompt.
Here, we show how to use the state-of-the-art pre-trained BloomZ 7b1 model from Hugging Face for Text2Text Generation on the following tasks. You can use BloomZ 7b1 directly for many NLP tasks, without fine-tuning the model.
- Multilingual text / sentiment classification
- Multilingual question answering
- Code generation
- Paragraph rephrasing
- Summarization
- Common sense reasoning / natural language inference
- Question answering
- Sentence / sentiment classification
- Translation
- Pronoun resolution
- Imaginary article generation based on a title
- Title generation based on an article
Note: This notebook was tested on an ml.t3.medium instance in Amazon SageMaker Studio with the conda_pytorch_p39 kernel, and in an Amazon SageMaker Notebook instance with the conda_python3 kernel.
1. Set Up
Before executing the notebook, some initial setup is required. This notebook requires ipywidgets.
Permissions and environment variables
To host on Amazon SageMaker, we need to set up and authenticate the use of AWS services. Here, we use the execution role associated with the current notebook as the AWS account role with SageMaker access.
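A minimal sketch of this setup step, assuming the SageMaker Python SDK is available (it is preinstalled in SageMaker Studio and Notebook instances); the helper name `setup_session` is our own:

```python
def setup_session():
    """Return the execution role ARN, region, and a SageMaker session.

    Imports are done lazily so the sketch can be read (and parsed) without
    the SDK installed; calling it requires a SageMaker-managed environment.
    """
    import sagemaker
    from sagemaker import get_execution_role

    aws_role = get_execution_role()     # IAM role attached to this notebook
    sess = sagemaker.Session()          # wraps the boto3 SageMaker clients
    aws_region = sess.boto_region_name  # region the notebook runs in
    return aws_role, aws_region, sess
```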
2. Select a pre-trained model
You can continue with the default model, or choose a different model from the dropdown generated upon running the next cell. A complete list of SageMaker pre-trained models can also be accessed at SageMaker pre-trained Models.
[Optional] Select a different SageMaker pre-trained model. Here, we download the model_manifest file from the Built-In Algorithms S3 bucket, filter out the Text2Text Generation models, and select a model for inference.
Choose a model for Inference
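The dropdown logic can be sketched as a simple filter over the manifest's model IDs. The IDs below are hypothetical examples; the real manifest is a JSON file with richer metadata per model:

```python
# Hypothetical sample of JumpStart model IDs from the manifest file.
manifest_model_ids = [
    "huggingface-text2text-bigscience-bloomz-7b1",
    "huggingface-text2text-flan-t5-xl",
    "huggingface-textgeneration-gpt2",
    "pytorch-ic-resnet50",
]

def filter_text2text(model_ids):
    """Keep only IDs whose task segment marks them as Text2Text Generation."""
    return [m for m in model_ids if "-text2text-" in m]

text2text_models = filter_text2text(manifest_model_ids)
model_id = text2text_models[0]  # default selection for this notebook
```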
3. Retrieve Artifacts & Deploy an Endpoint
Using SageMaker, we can perform inference on the pre-trained model, even without fine-tuning it first on a new dataset. We start by retrieving the deploy_image_uri, deploy_source_uri, and model_uri for the pre-trained model. To host the pre-trained model, we create an instance of sagemaker.model.Model and deploy it. This may take a few minutes.
We need to create a directory to host the downloaded model.
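The retrieve-and-deploy flow can be sketched as below, following the pattern used across JumpStart notebooks with the SDK's `image_uris`, `script_uris`, and `model_uris` helpers. The entry point name and instance type here are assumptions, not verified defaults; the function is not executed in this cell since deployment needs AWS credentials and can take several minutes:

```python
def deploy_model(model_id, model_version="*", instance_type="ml.g5.12xlarge"):
    """Retrieve JumpStart artifacts for `model_id` and deploy an endpoint."""
    import sagemaker
    from sagemaker import get_execution_role, image_uris, model_uris, script_uris
    from sagemaker.model import Model
    from sagemaker.predictor import Predictor

    # Inference container image for this model and instance type.
    deploy_image_uri = image_uris.retrieve(
        region=None, framework=None, image_scope="inference",
        model_id=model_id, model_version=model_version,
        instance_type=instance_type,
    )
    # Inference script and pre-trained model artifacts.
    deploy_source_uri = script_uris.retrieve(
        model_id=model_id, model_version=model_version, script_scope="inference"
    )
    model_uri = model_uris.retrieve(
        model_id=model_id, model_version=model_version, model_scope="inference"
    )

    endpoint_name = sagemaker.utils.name_from_base(f"jumpstart-{model_id}")
    model = Model(
        image_uri=deploy_image_uri,
        source_dir=deploy_source_uri,
        model_data=model_uri,
        entry_point="inference.py",  # entry point name is an assumption
        role=get_execution_role(),
        predictor_cls=Predictor,
        name=endpoint_name,
    )
    # Returns a Predictor bound to the new endpoint.
    return model.deploy(
        initial_instance_count=1,
        instance_type=instance_type,
        endpoint_name=endpoint_name,
    )
```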
4. Query endpoint and parse response
The input to the endpoint is any string of text formatted as JSON and encoded in UTF-8. The output of the endpoint is a JSON object containing the generated text.
Below, we provide some example input text. You can put in any text, and the model predicts the next words in the sequence. Longer sequences of text can be generated by calling the model repeatedly.
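A hedged sketch of the query helpers: the payload key `text_inputs` and the response key `generated_texts` follow the convention used by JumpStart text2text models, but verify them against the model you deploy. Only `query_endpoint` needs AWS credentials; the encode/parse helpers are pure:

```python
import json

def build_payload(text):
    """Encode the input text as UTF-8 JSON, the format the endpoint expects."""
    return json.dumps({"text_inputs": text}).encode("utf-8")

def parse_response(response_body):
    """Extract the generated text from the endpoint's JSON response."""
    model_predictions = json.loads(response_body)
    return model_predictions["generated_texts"]  # key name is an assumption

def query_endpoint(endpoint_name, text):
    """Invoke a deployed endpoint (requires AWS credentials; not run here)."""
    import boto3
    client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_payload(text),
    )
    return parse_response(response["Body"].read())
```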
5. Advanced features: How to use various advanced parameters to control the generated text
This model also supports many advanced parameters while performing inference. They include:
- max_length: The model generates text until the output length (including the input context length) reaches max_length. If specified, it must be a positive integer.
- num_return_sequences: Number of output sequences returned. If specified, it must be a positive integer.
- num_beams: Number of beams used in beam search. If specified, it must be an integer greater than or equal to num_return_sequences.
- no_repeat_ngram_size: The model ensures that no word sequence of length no_repeat_ngram_size is repeated in the output sequence. If specified, it must be a positive integer greater than 1.
- temperature: Controls the randomness of the output. A higher temperature yields output sequences with more low-probability words, while a lower temperature yields output sequences with more high-probability words. As temperature approaches 0, decoding becomes greedy. If specified, it must be a positive float.
- early_stopping: If True, text generation finishes when all beam hypotheses reach the end-of-sentence token. If specified, it must be a boolean.
- do_sample: If True, the next word is sampled according to its likelihood. If specified, it must be a boolean.
- top_k: In each step of text generation, sample from only the top_k most likely words. If specified, it must be a positive integer.
- top_p: In each step of text generation, sample from the smallest possible set of words whose cumulative probability is top_p. If specified, it must be a float between 0 and 1.
- seed: Fix the random state for reproducibility. If specified, it must be an integer.
We may specify any subset of the parameters above while invoking the endpoint. Next, we show an example of how to invoke the endpoint with these arguments.
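For example, a payload exercising several of these parameters might look like the following (the prompt text is arbitrary and the parameter values are illustrative):

```python
import json

# Illustrative payload combining input text with advanced parameters.
payload = {
    "text_inputs": "Tell me the steps to make a pizza:",
    "max_length": 50,
    "num_return_sequences": 3,
    "num_beams": 3,            # must be >= num_return_sequences
    "no_repeat_ngram_size": 2,
    "temperature": 0.7,
    "early_stopping": True,
    "do_sample": True,
    "top_k": 50,
    "top_p": 0.95,
    "seed": 123,
}
body = json.dumps(payload).encode("utf-8")  # wire format for invoke_endpoint
```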
6. Advanced features: How to use prompt engineering to solve different tasks
Below we demonstrate solving 12 key tasks with the BloomZ 7B1 model. The tasks are the following:
- Multilingual text / sentiment classification
- Multilingual question answering
- Code generation
- Paragraph rephrasing
- Summarization
- Common sense reasoning / natural language inference
- Question answering
- Sentence / sentiment classification
- Translation
- Pronoun resolution
- Imaginary article generation based on a title
- Title generation based on an article
6.1. Multilingual text / sentiment classification (Chinese to English)
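A hypothetical prompt for this task: the review is written in Chinese while the instruction and answer options are in English, so a correct answer demonstrates cross-lingual understanding. The template below is our own illustration, not the notebook's exact prompt:

```python
def sentiment_prompt(review, options):
    """Combine a (possibly non-English) review with English answer options."""
    return f"Review: {review}\nIs this review positive or negative? {options}"

prompt = sentiment_prompt(
    "这部电影很无聊。",  # Chinese: "This movie is very boring."
    "OPTIONS:\n-positive\n-negative",
)
```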
6.2. Multilingual question answering (English to Chinese)
6.3. Code generation
6.4. Paragraph rephrasing
6.5. Summarization
Define the text article you want to summarize.
6.6. Common sense reasoning / natural language inference
In common sense reasoning, you can design a prompt, combine it with the premise, hypothesis, and options, and send the combined text to the endpoint to get an answer. Examples are demonstrated below.
Define the premise, hypothesis, and options that you want the model to reason about.
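The combination step can be sketched as follows; the exact wording of the template, and the example premise and hypothesis, are our own assumptions:

```python
def nli_prompt(premise, hypothesis, options):
    """Combine premise, hypothesis, and answer options into a single prompt."""
    return (
        f"Premise: {premise}\nHypothesis: {hypothesis}\n"
        f"Does the premise entail the hypothesis? {options}"
    )

prompt = nli_prompt(
    "The doctor is waiting for the patient in her office.",
    "The patient has already arrived.",
    "OPTIONS:\n-yes\n-no\n-maybe",
)
```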
6.7. Question Answering
Now, let's try another reasoning task with a different type of prompt template. You can simply provide the context and question as shown below.
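A minimal sketch of such a context/question template (the context sentence is our own example, and the trailing "Answer:" cue is an assumption):

```python
def qa_prompt(context, question):
    """Format context and question so the model answers from the context."""
    return f"{context}\nQuestion: {question}\nAnswer:"

prompt = qa_prompt(
    "Amazon SageMaker is a fully managed service for building, training, "
    "and deploying machine learning models.",
    "What is Amazon SageMaker?",
)
```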
6.8. Sentence / Sentiment Classification
Define the sentence you want to classify and the corresponding options.
Now let's see if the model can identify whether the sentence is grammatically correct.
6.9. Translation
Define the sentence and the target language you want to translate it into.
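A sketch of a translation prompt; the one-line instruction style is an assumption:

```python
def translation_prompt(sentence, target_language):
    """Ask the model to translate the sentence into the target language."""
    return f"Translate to {target_language}: {sentence}"

prompt = translation_prompt("My name is Arthur.", "German")
```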
6.10. Imaginary article generation based on title
6.11. Generate a title based on the article
7. Clean up the endpoint
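Cleanup can be sketched as below, assuming `predictor` is the object returned by `model.deploy()`; both calls exist on `sagemaker.predictor.Predictor`:

```python
def clean_up(predictor):
    """Delete the SageMaker model and endpoint to stop incurring charges."""
    predictor.delete_model()     # remove the model resource
    predictor.delete_endpoint()  # delete the endpoint and its configuration
```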
Notebook CI Test Results
This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.