Introduction to prompt design

Prompt design is the process of creating prompts that elicit the desired response from language models. Writing well-structured prompts is an essential part of ensuring accurate, high-quality responses from a language model. This page introduces some basic concepts, strategies, and best practices to get you started in designing prompts.

What is a prompt?

A prompt is a natural language request submitted to a language model to receive a response back. Prompts can contain questions, instructions, contextual information, examples, and partial input for the model to complete or continue. After the model receives a prompt, depending on the type of model being used, it can generate text, embeddings, code, images, videos, music, and more.
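
For example, here is a minimal sketch of submitting a text prompt through the PaLM API's Python client (this assumes the google.generativeai package and the text-bison-001 text model; adapt the model name and client library to your environment):

    import google.generativeai as palm

    palm.configure(api_key="YOUR_API_KEY")  # replace with your API key

    # Submit a natural language prompt and print the model's text response.
    completion = palm.generate_text(
        model="models/text-bison-001",
        prompt="Give me three ideas for a short story about a lighthouse keeper.",
    )
    print(completion.result)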

Prompt content types

Prompts can include one or more of the following types of content:

Input

An input is the text in the prompt that you want the model to provide a response for, and it's a required content type. Inputs can be a question that the model answers (question input), a task the model performs (task input), an entity the model operates on (entity input), or partial input that the model completes or continues (completion input).

Question input

A question input is a question that you ask the model that the model provides an answer to.
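
For example, a question input might be:

    What's a good name for a flower shop that specializes in selling bouquets of dried flowers?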

Task input

A task input is a task that you want the model to perform. For example, you can tell the model to give you ideas or suggestions for something.
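
For example, a task input might be:

    Give me a simple list of just the things that I must bring on a camping trip. The list should have 5 items.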

Entity input

An entity input is what the model performs an action on, such as classifying or summarizing. This type of input can benefit from the inclusion of instructions.
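
For example, an entity input paired with an instruction might look like this:

    Classify the following items as [large, small].
    Elephant
    Mouse
    Snail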

Completion input

A completion input is text that the model is expected to complete or continue.
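
For example, a completion input might be:

    Complete the following prompt: Some simple strategies for overcoming writer's block include...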

Context

Context can be one of the following:

  • Instructions that specify how the model should behave.
  • Information that the model uses or references to generate a response.

Add contextual information in your prompt when you need to give information to the model, or restrict the boundaries of the responses to only what's within the prompt.

Examples

Examples are input-output pairs that you include in the prompt to give the model an example of an ideal response. Including examples in the prompt is an effective strategy for customizing the response format.

General prompt design strategies

Prompt design enables users who are new to machine learning (ML) to control model output with minimal overhead. By carefully crafting prompts, you can nudge the model to generate a desired result. Prompt design is an efficient way to experiment with adapting a language model for a specific use case.

Language models, especially large language models (LLMs), are trained on vast amounts of text data to learn the patterns and relationships between words. When given some text (the prompt), language models can predict what is likely to come next, like a sophisticated autocompletion tool. Therefore, when designing prompts, consider the different factors that can influence what a model predicts comes next.

While there's no right or wrong way to design a prompt, there are common strategies that you can use to affect the model's responses. This section introduces you to some common prompt design strategies.

Give clear instructions

Giving the model instructions means telling the model what to do. This strategy can be an effective way to customize model behavior. Ensure that the instructions you give are clear and concise.

The following prompt provides a block of text and tells the model to summarize it:
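
Prompt:

    Summarize this text:
    Quantum computers rely on qubits, which can represent multiple states at once, so they can solve certain problems far faster than classical machines. Qubits are fragile, however, and are easily disturbed by heat and noise, so quantum computers need heavy error correction and extreme cooling to work reliably.

Response:

    Quantum computers use fragile qubits that exploit superposition to outperform classical machines on certain problems, but they require extensive error correction and cooling to work reliably.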

The model provided a concise summary, but maybe you want the summary to be written in a way that's easier to understand. For example, the following prompt includes an instruction to write a summary that's simple enough for a fifth grader to understand:
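
Prompt:

    Summarize this text so that a fifth grader can understand it:
    Quantum computers rely on qubits, which can represent multiple states at once, so they can solve certain problems far faster than classical machines. Qubits are fragile, however, and are easily disturbed by heat and noise, so quantum computers need heavy error correction and extreme cooling to work reliably.

Response:

    Quantum computers are special computers that can try many answers at the same time, which makes them very fast at certain puzzles. But their parts are very delicate, so they have to be kept super cold and checked for mistakes all the time.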

The instruction to write the summary so a fifth grader can understand it resulted in a response that's easier to understand.

Summary:

  • Give the model instructions to customize its behavior.
  • Make each instruction clear and concise.

Include examples

You can include examples in the prompt that show the model what getting it right looks like. The model attempts to identify patterns and relationships from the examples and apply them to form a response. Prompts that contain a few examples are called few-shot prompts, while prompts that provide no examples are called zero-shot prompts. Few-shot prompts are often used to regulate the formatting, phrasing, scoping, or general patterning of model responses.

Zero-shot vs few-shot prompts

The following zero-shot prompt asks the model to choose the best explanation.
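
Prompt:

    Please choose the best explanation to the question:

    Question: How is snow formed?
    Explanation1: Snow is formed when water vapor in the air freezes into ice crystals in the atmosphere, which can combine and grow into snowflakes as they fall through the atmosphere and accumulate on the ground.
    Explanation2: Water vapor freezes into ice crystals forming snow.
    Answer:

Response:

    Explanation1 is the better explanation because it describes the full process of snow formation.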

If your use case requires the model to produce concise responses, you can include examples in the prompt that give preference to concise responses.

The following prompt provides two examples that show preference to the shorter explanations. In the response, you can see that the examples guided the model to choose the shorter explanation (Explanation2) as opposed to the longer explanation (Explanation1) like it did previously.
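
Prompt:

    Below are some examples showing a question, explanation, and answer format:

    Question: Why is the sky blue?
    Explanation1: The sky appears blue because of Rayleigh scattering, which causes shorter blue wavelengths of light to be scattered more easily than longer red wavelengths, making the sky look blue.
    Explanation2: Blue light is scattered more than other colors as sunlight passes through the atmosphere.
    Answer: Explanation2

    Question: What is the cause of earthquakes?
    Explanation1: Sudden slipping of tectonic plates.
    Explanation2: Earthquakes are caused by the sudden release of energy in the Earth's crust, most often when stress that builds up along the edges of tectonic plates is released as the plates slip past one another.
    Answer: Explanation1

    Now answer the following question in the same format:

    Question: How is snow formed?
    Explanation1: Snow is formed when water vapor in the air freezes into ice crystals in the atmosphere, which can combine and grow into snowflakes as they fall through the atmosphere and accumulate on the ground.
    Explanation2: Water vapor freezes into ice crystals forming snow.
    Answer:

Response:

    Explanation2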

Find the optimal number of examples

You can experiment with the number of examples to provide in the prompt to get the most desired results. Models like PaLM can often pick up on patterns using a few examples, though you may need to experiment with the number of examples that leads to the desired results. For simpler models like BERT, you may need more examples. At the same time, if you include too many examples, the model may start to overfit the response to the examples.

Use examples to show patterns instead of antipatterns

Using examples to show the model a pattern to follow is more effective than using examples to show the model an antipattern to avoid.

Negative pattern:
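
    Don't end haikus with a question:
    Haiku are fun
    A short and simple poem
    Don't you enjoy them?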

Positive pattern:
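
    Always end haikus with an assertion:
    Haiku are fun
    A short and simple poem
    A joy to write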

Summary:

  • Including prompt-response examples in the prompt helps the model learn how to respond.
  • Give the model examples of the patterns to follow instead of examples of patterns to avoid.
  • Experiment with the number of examples to include in the prompt. Depending on the model, too few examples are ineffective at changing model behavior. Too many examples can cause the model to overfit.

Let the model complete partial input

Generative language models work like an advanced autocompletion tool. When you provide partial content, the model can provide the rest of the content, or what it considers a continuation of that content, as a response. If you include examples or context in the prompt, the model takes those examples and context into account when completing the input.

The following example provides a prompt with an instruction and an entity input:
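
Prompt:

    For the given order, return a JSON object that has the fields cheeseburger, hamburger, fries, or drink, with the value being the quantity.

    Order: A burger and a drink.

Response:

    {
      "hamburger": 1,
      "drink": 1
    }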

While the model did as prompted, writing out the instructions in natural language can sometimes be challenging. In this case, you can give an example and a response prefix and let the model complete it:
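
Prompt:

    Valid fields are cheeseburger, hamburger, fries, and drink.
    Order: Give me a cheeseburger and fries
    Output:
    {
      "cheeseburger": 1,
      "fries": 1
    }
    Order: I want two cheeseburgers and a drink, and waffles.
    Output:

Response:

    {
      "cheeseburger": 2,
      "drink": 1
    }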

Notice how "waffles" was excluded from the output because it wasn't listed in the context as a valid field.

Prompt the model to format its response

The completion strategy can also help format the response. The following example prompts the model to create an essay outline:
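
Prompt:

    Create an outline for an essay about hummingbirds.

Response:

    1. Introduction
    2. Physical characteristics of hummingbirds
    3. Diet and feeding behavior
    4. Habitat and migration
    5. Conclusion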

The prompt didn't specify the format for the outline and the model chose a format for you. To get the model to return an outline in a specific format, you can add text that represents the start of the outline and let the model complete it based on the pattern that you initiated.
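
Prompt:

    Create an outline for an essay about hummingbirds.
    I. Introduction
       *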

Summary:

  • If you give the model a partial input, the model completes that input based on any available examples or context in the prompt.
  • Having the model complete an input may sometimes be easier than describing the task in natural language.
  • Adding a partial answer to a prompt can guide the model to follow a desired pattern or format.

Add contextual information

Instead of assuming that the model has all of the required information, you can include in the prompt the instructions and information that the model needs to solve a problem.

The following example asks the model to give troubleshooting guidance for a router:
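
Prompt:

    What should I do to fix my disconnected wifi? The light on my Google Wifi router is yellow and blinking slowly.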

The response looks like generic troubleshooting information that's not specific to the router or the status of the LED indicator lights.

To customize the response for the specific router, you can add to the prompt the router's troubleshooting guide as context for it to refer to when providing a response.
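
Prompt:

    Answer the question using the text below. Respond with only the text provided.

    Question: What should I do to fix my disconnected wifi? The light on my Google Wifi router is yellow and blinking slowly.

    Text:
    Color: Slowly pulsing yellow
    What it means: There is a network error.
    What to do: Check that the Ethernet cable is connected to both your router and your modem and both devices are turned on. You might need to unplug and plug in each device again.

    Color: Fast blinking yellow
    What it means: You are holding down the reset button and are factory resetting this device.
    What to do: If you keep holding down the reset button, after about 12 seconds, the light will turn solid yellow. Once it is solid yellow, let go of the factory reset button.

Response:

    Check that the Ethernet cable is connected to both your router and your modem and both devices are turned on. You might need to unplug and plug in each device again.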

Summary:

  • Include information (context) in the prompt that you want the model to use when generating a response.
  • Give the model instructions on what to do.

Add prefixes

A prefix is a word or phrase that you add to the prompt content that can serve several purposes, depending on where you put the prefix:

  • Input prefix: Adding a prefix to the input signals semantically meaningful parts of the input to the model. For example, the prefixes "English:" and "French:" demarcate two different languages.
  • Output prefix: Even though the output is generated by the model, you can add a prefix for the output in the prompt. The output prefix gives the model information about what's expected as a response. For example, the output prefix "JSON:" signals to the model that the output should be in JSON format.
  • Example prefix: In few-shot prompts, adding prefixes to the examples provides labels that the model can use when generating the output, which makes output content easier to parse.

In the following example, "Text:" is the input prefix and "The answer is:" is the output prefix.
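
Prompt:

    Classify the text as one of the following categories.
    - large
    - small
    Text: Rhino
    The answer is: large
    Text: Elephant
    The answer is: large
    Text: Rat
    The answer is: small
    Text: Snail
    The answer is: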

Experiment with different parameter values

Each call that you send to a model includes parameter values that control how the model generates a response. The model can generate different results for different parameter values. Experiment with different parameter values to get the best values for the task. The parameters available for different models may differ. The most common parameters are the following:

  • Max output tokens
  • Temperature
  • Top-K
  • Top-P

Max output tokens

Maximum number of tokens that can be generated in the response. A token is approximately four characters. 100 tokens correspond to roughly 60-80 words.

Specify a lower value for shorter responses and a higher value for longer responses.

Temperature

The temperature is used for sampling during response generation, which occurs when top-P and top-K are applied. Temperature controls the degree of randomness in token selection. Lower temperatures are good for prompts that require a more deterministic and less open-ended or creative response, while higher temperatures can lead to more diverse or creative results. A temperature of 0 is deterministic, meaning that the highest-probability response is always selected.

For most use cases, try starting with a temperature of 0.2. If the model returns a response that's too generic, too short, or the model gives a fallback response, try increasing the temperature.

Top-K

Top-K changes how the model selects tokens for output. A top-K of 1 means the next selected token is the most probable among all tokens in the model's vocabulary (also called greedy decoding), while a top-K of 3 means that the next token is selected from among the three most probable tokens by using temperature.

For each token selection step, the top-K tokens with the highest probabilities are sampled. Then tokens are further filtered based on top-P with the final token selected using temperature sampling.

Specify a lower value for less random responses and a higher value for more random responses. The default top-K is 40.

Top-P

Top-P changes how the model selects tokens for output. Tokens are selected from the most (see top-K) to least probable until the sum of their probabilities equals the top-P value. For example, if tokens A, B, and C have probabilities of 0.3, 0.2, and 0.1 and the top-P value is 0.5, then the model selects either A or B as the next token by using temperature and excludes C as a candidate.

Specify a lower value for less random responses and a higher value for more random responses. The default top-P is 0.95.
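
As an illustration, here is a minimal sketch of setting these parameters with the PaLM API's Python client (again assuming the google.generativeai package; the available parameters, their names, and their defaults vary by model and SDK version):

    import google.generativeai as palm

    palm.configure(api_key="YOUR_API_KEY")  # replace with your API key

    # Lower temperature, top-K, and top-P values make token selection less
    # random; higher values make responses more diverse.
    completion = palm.generate_text(
        model="models/text-bison-001",
        prompt="Suggest three names for a flower shop.",
        temperature=0.2,        # start low; increase if responses are too generic
        max_output_tokens=256,  # cap the length of the generated response
        top_k=40,               # sample from the 40 most probable tokens...
        top_p=0.95,             # ...filtered to a cumulative probability of 0.95
    )
    print(completion.result)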

Prompt iteration strategies

Prompt design is an iterative process that often requires a few iterations before you get the desired response consistently. This section provides guidance on some things you can try when iterating on your prompts.

Use different phrasing

Using different words or phrasing in your prompts often yields different responses from the model even though they all mean the same thing. If you're not getting the expected results from your prompt, try rephrasing it.

Switch to an analogous task

If you can't get the model to follow your instructions for a task, try giving it instructions for an analogous task that achieves the same result.

This prompt tells the model to categorize a book by using predefined categories.
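
Prompt:

    Categorize the book The Odyssey into one of the following categories: thriller, sci-fi, mythology, biography.

Response:

    The Odyssey is an epic poem that tells the story of the Greek hero Odysseus and his long journey home. It is most closely related to mythology.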

The response is correct, but the model didn't stay within the bounds of the options. You also want the model to respond with just one of the options instead of a full sentence. In this case, you can rephrase the instructions as a multiple choice question and ask the model to choose an option:
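
Prompt:

    Multiple choice problem: Which of the following options describes the book The Odyssey?
    Options:
    - thriller
    - sci-fi
    - mythology
    - biography

Response:

    The correct answer is mythology.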

Change the order of prompt content

The order of the content in the prompt can sometimes affect the response. Try changing the content order and see how that affects the response.

Version 1:
[examples]
[context]
[input]

Version 2:
[input]
[examples]
[context]

Version 3:
[examples]
[input]
[context]

Fallback responses

A fallback response is a response returned by the model when either the prompt or the response triggers a safety filter. An example of a fallback response is "I'm not able to help with that, as I'm only a language model."

If the model responds with a fallback response, try increasing the temperature.

Things to avoid

  • Avoid relying on models to generate factual information.
  • Use with care on math and logic problems.

Next steps

  • Now that you have a deeper understanding of prompt design, try writing your own prompts using MakerSuite.