Completions With Dynamic Prompt
Dynamic Prompting for task completion
Recent papers such as "Do Prompt-Based Models Really Understand the Meaning of their Prompts?" and "What Makes Good In-Context Examples for GPT-3?" have shown that using a dynamic set of examples, instead of a fixed set, helps GPT-3 perform the task with higher accuracy.
[1]
[2]
Dataset Summary
The Text REtrieval Conference (TREC) Question Classification dataset is a dataset for question classification consisting of open-domain, fact-based questions divided into broad semantic categories. It contains 5,500 labeled questions in the training set and another 500 in the test set.
The dataset has 6 coarse class labels and 50 fine class labels. The average sentence length is 10 words, and the vocabulary size is 8,700.
[3]
[4]
DatasetDict({
    train: Dataset({
        features: ['text', 'coarse_label', 'fine_label'],
        num_rows: 5452
    })
    test: Dataset({
        features: ['text', 'coarse_label', 'fine_label'],
        num_rows: 500
    })
})
[5]
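For reference in the rest of the notebook, the six coarse classes map to label ids 0–5 in the order below; this ordering matches the per-class supports shown in the classification reports that follow. A minimal sketch (the helper name is an assumption, not part of the original code):

```python
# Coarse TREC classes in coarse_label-id order (0-5).
COARSE_CLASSES = ["ABBR", "ENTY", "DESC", "HUM", "LOC", "NUM"]

def label_name(label_id: int) -> str:
    """Map a coarse_label id from the dataset to its class name."""
    return COARSE_CLASSES[label_id]
```

For example, `label_name(5)` returns `"NUM"`, the class of numeric questions.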
Task Prompt
[6]
Setup OpenAI API
[7]
[8]
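The setup cell itself is not visible in this export; a typical Azure OpenAI configuration with the legacy 0.x-style `openai` Python SDK looks like the sketch below. The environment-variable names and API version are assumptions, not values from the original notebook.

```python
import os
import openai

# Point the legacy openai SDK at an Azure OpenAI resource.
openai.api_type = "azure"
openai.api_base = os.getenv("OPENAI_API_BASE")  # e.g. https://<resource>.openai.azure.com/ (assumed var name)
openai.api_version = "2023-05-15"               # an API version valid for completions (assumption)
openai.api_key = os.getenv("OPENAI_API_KEY")    # assumed var name
```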
Zero-shot Prompt
Example of the zero-shot prompt
[9]
As a Question Answering agent, your goal is to categorize questions into different semantic classes that impose constraints on potential answers, so that they can be utilized in later stages of the question answering process. Following are the semantic classes: [ABBR, ENTY, DESC, HUM, LOC, NUM]

Classify the following question into one of the above classes. Please answer in a single word.

question: How far is it from Denver to Aspen ?
output:
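A prompt like the one above can be assembled from a template per test question; a minimal sketch (the template text follows the example above, but the helper name is an assumption):

```python
CLASSES = ["ABBR", "ENTY", "DESC", "HUM", "LOC", "NUM"]

ZERO_SHOT_TEMPLATE = (
    "As a Question Answering agent, your goal is to categorize questions into "
    "different semantic classes that impose constraints on potential answers, "
    "so that they can be utilized in later stages of the question answering "
    "process. Following are the semantic classes: [{classes}] "
    "Classify the following question into one of the above classes. "
    "Please answer in a single word. "
    "question: {question} output:"
)

def zero_shot_prompt(question: str) -> str:
    """Fill the zero-shot template with one test question."""
    return ZERO_SHOT_TEMPLATE.format(classes=", ".join(CLASSES), question=question)
```

Calling `zero_shot_prompt("How far is it from Denver to Aspen ?")` reproduces the prompt shown above.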
[10]
[11]
precision recall f1-score support
0 0.69 1.00 0.82 9
1 0.30 0.69 0.42 94
2 0.70 0.15 0.25 138
3 0.88 0.35 0.51 65
4 0.70 0.90 0.78 81
5 0.84 0.80 0.82 113
accuracy 0.56 500
macro avg 0.69 0.65 0.60 500
weighted avg 0.68 0.56 0.54 500
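The report above comes from mapping the model's one-word completion back to a coarse label id and scoring against the gold `coarse_label` (e.g. with `sklearn.metrics.classification_report`). A sketch of the answer-to-id mapping; the cleaning rules and fallback id are assumptions:

```python
CLASSES = ["ABBR", "ENTY", "DESC", "HUM", "LOC", "NUM"]

def answer_to_label(answer: str, fallback: int = -1) -> int:
    """Map the model's one-word completion to a coarse label id.

    Strips surrounding whitespace/punctuation and upper-cases, so that
    e.g. " num." maps to 5. Answers outside the six classes return
    `fallback` (assumption: the original notebook may handle these
    differently).
    """
    cleaned = answer.strip().strip(".,:").upper()
    return CLASSES.index(cleaned) if cleaned in CLASSES else fallback
```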
Few-shot Prompt
[12]
Example of the few-shot prompt
[13]
As a Question Answering agent, your goal is to categorize questions into different semantic classes that impose constraints on potential answers, so that they can be utilized in later stages of the question answering process. Following are the semantic classes: [ABBR, ENTY, DESC, HUM, LOC, NUM]

Following are some examples.

question: What is the full form of .com ?
output: ABBR

question: What films featured the character Popeye Doyle ?
output: ENTY

question: How did serfdom develop in and then leave Russia ?
output: DESC

question: What contemptible scoundrel stole the cork from my lunch ?
output: HUM

question: What sprawling U.S. state boasts the most airports ?
output: LOC

question: When was Ozzy Osbourne born ?
output: NUM

Classify the following question into one of the above classes. Please answer in a single word.

question: How far is it from Denver to Aspen ?
output:
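The few-shot prompt above inserts one fixed example per class between the instruction and the query. A sketch of the assembly, with the example set taken from the prompt above (the helper name is an assumption):

```python
# One fixed (question, label) example per coarse class, as in the prompt above.
FEW_SHOT_EXAMPLES = [
    ("What is the full form of .com ?", "ABBR"),
    ("What films featured the character Popeye Doyle ?", "ENTY"),
    ("How did serfdom develop in and then leave Russia ?", "DESC"),
    ("What contemptible scoundrel stole the cork from my lunch ?", "HUM"),
    ("What sprawling U.S. state boasts the most airports ?", "LOC"),
    ("When was Ozzy Osbourne born ?", "NUM"),
]

INSTRUCTION = (
    "As a Question Answering agent, your goal is to categorize questions into "
    "different semantic classes that impose constraints on potential answers, "
    "so that they can be utilized in later stages of the question answering "
    "process. Following are the semantic classes: [ABBR, ENTY, DESC, HUM, LOC, NUM]"
)

def few_shot_prompt(question, examples=FEW_SHOT_EXAMPLES):
    """Instruction + example shots + the test question to classify."""
    shots = " ".join(f"question: {q} output: {a}" for q, a in examples)
    return (
        f"{INSTRUCTION} Following are some examples. {shots} "
        "Classify the following question into one of the above classes. "
        f"Please answer in a single word. question: {question} output:"
    )
```

Because `examples` is a parameter, the same function can later be reused with a per-question example set for the dynamic prompt.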
[14]
[15]
precision recall f1-score support
0 0.75 1.00 0.86 9
1 0.41 0.57 0.48 94
2 0.80 0.51 0.62 138
3 0.96 0.69 0.80 65
4 0.93 0.93 0.93 81
5 0.80 1.00 0.89 113
accuracy 0.73 500
macro avg 0.77 0.78 0.76 500
weighted avg 0.77 0.73 0.73 500
Extract Embeddings for the Training Dataset
[16]
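The extraction cell is not included in this export; with the legacy `openai` SDK it would call the Embedding endpoint once per training question, roughly as sketched below (the deployment name is an assumption, and the call requires the API setup from earlier). The cosine-similarity helper is the piece the dynamic prompt needs.

```python
import numpy as np

def get_embedding(text: str, deployment: str = "text-embedding-ada-002") -> np.ndarray:
    """Embed one question with Azure OpenAI (legacy 0.x SDK call).

    `deployment` is an assumed deployment name; this function needs the
    API configuration shown earlier and is not called at import time.
    """
    import openai  # imported here so the sketch loads without the SDK installed
    resp = openai.Embedding.create(input=[text], engine=deployment)
    return np.array(resp["data"][0]["embedding"])

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```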
Dynamic Few-shot Prompt
[17]
[18]
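Given embeddings for the training questions and for a test question, the dynamic prompt selects the k most similar training examples as the shots for that question. A sketch with NumPy (function names and the default k are assumptions):

```python
import numpy as np

def top_k_neighbors(query_emb: np.ndarray, train_embs: np.ndarray, k: int = 6) -> np.ndarray:
    """Indices of the k training questions most similar to the query.

    `train_embs` is an (n, d) matrix of training-question embeddings.
    Uses cosine similarity; returns indices sorted most- to least-similar.
    """
    q = query_emb / np.linalg.norm(query_emb)
    t = train_embs / np.linalg.norm(train_embs, axis=1, keepdims=True)
    sims = t @ q
    return np.argsort(sims)[::-1][:k]

def dynamic_examples(query_emb, train_embs, questions, labels, k=6):
    """(question, label) shots for the dynamic few-shot prompt."""
    return [(questions[i], labels[i]) for i in top_k_neighbors(query_emb, train_embs, k)]
```

The selected pairs drop into the same few-shot template as before, so each test question gets its own example set instead of the fixed one.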
Example of the dynamic prompt
[ ]
[20]
[21]
precision recall f1-score support
0 0.69 1.00 0.82 9
1 0.66 0.76 0.70 94
2 0.88 0.71 0.79 138
3 0.95 0.91 0.93 65
4 0.94 0.90 0.92 81
5 0.88 0.99 0.93 113
accuracy 0.84 500
macro avg 0.83 0.88 0.85 500
weighted avg 0.85 0.84 0.84 500