PaLM API Overview

The PaLM API allows developers to use state-of-the-art Large Language Models (LLMs) to build language applications. This overview contains programming language independent information about the API. Once you're familiar with the general features available to you through the API, try a quickstart for your language of choice (Python, Node, Java, Swift) to start developing.

If you're looking for enterprise-level safety, privacy, security and scalability, try the PaLM API on Google Cloud's Vertex AI. Visit the Overview of generative AI support on Vertex AI to learn more.

PaLM API for Text and Chat

The PaLM API provides multiple text generation capabilities. You can generate text using the text or chat services. This section describes both and highlights some key differences and use cases for each to help you get started. When you're ready to start developing, you can find complete runnable code in the text quickstart and the chat quickstart.

Compare text and chat services

Both the text service and the chat service are designed to help you build powerful applications with natural language processing (NLP) capabilities. However, they are optimized for different types of applications and use cases.

Text service

The text service is designed for single-turn interactions. It's ideal for tasks that can be completed within one response from the API, without the need for a continuous conversation. The text service lets you obtain text completions, generate summaries, or perform other NLP tasks that don't require back-and-forth interactions.

For example, you can send a prompt such as:

Rewrite the following text

Followed by some text to rewrite.

See the text quickstart to try out this service.

Chat service

The chat service is designed for interactive, multi-turn conversations. The service lets you create applications that engage users in dynamic and context-aware conversations. You can provide a context for the conversation as well as examples of conversation turns for the model to follow. It's ideal for applications that require ongoing communication, such as chatbots, interactive tutors, or customer support assistants.

For example, you can send in a context such as:

Pretend you're a snowman and stay in character for each response

with some examples of input and model output such as:

Hi, who are you?

I'm a snowman!

The model's responses attempt to align with the context and examples provided.

See the chat quickstart to try out this service.
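The snowman context and example turn above map directly onto the prompt structure that the chat service accepts. As a minimal sketch in Python (the helper name `build_chat_prompt` is illustrative, not part of the API), assembling that payload might look like:

```python
def build_chat_prompt(context, examples, messages):
    """Assemble a chat prompt payload: an optional context, optional
    (input, output) example pairs, and the conversation messages."""
    return {
        "context": context,
        "examples": [
            {"input": {"content": i}, "output": {"content": o}}
            for i, o in examples
        ],
        "messages": [{"content": m} for m in messages],
    }

prompt = build_chat_prompt(
    context="Pretend you're a snowman and stay in character for each response",
    examples=[("Hi, who are you?", "I'm a snowman!")],
    messages=["What's it like being a snowman?"],
)
```

The resulting dictionary has the same shape as the prompt field in the generateMessage request shown later in this overview.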

Prompt design

As you can see from the examples in the previous sections, you influence the output you get from the model using prompts. This is referred to as prompt engineering or prompt design. Creating effective prompts is a combination of art and science. See the prompt guidelines for guidance on how to approach prompting and the prompt 101 guide to learn about different types of prompting.
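For instance, one common prompt design technique is few-shot prompting, where the prompt includes a handful of worked input/output examples before the new input. A minimal sketch (the helper and its exact formatting are illustrative):

```python
def few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot text prompt: an instruction, worked
    input/output examples, then the new input for the model to complete."""
    lines = [instruction, ""]
    for inp, out in examples:
        lines += [f"Input: {inp}", f"Output: {out}", ""]
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)

prompt = few_shot_prompt(
    "Classify each movie review as positive or negative.",
    [("I loved every minute.", "positive"), ("A total waste of time.", "negative")],
    "The acting was superb.",
)
```

The assembled string can then be sent as the prompt of a text request.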

Choose the right service

When deciding between the text and chat services, consider the types of interactions you want to enable.

  • If your application primarily requires single-turn interactions and tasks that can be completed in one response such as summaries or translations, start with the text service.
  • If your application involves ongoing conversations, or multi-turn interactions, start with the chat service.

Generate text

To generate text using the text service, send a GenerateTextRequest. The request requires two parameters: model and prompt.

  • model: This is required and must match one of the text models available.
  • prompt: This field is required and contains the string prompt that you're sending to the model.

See the text reference for more details on each required and optional field.

The following shows the method for each supported language.

Curl

Use the generateText method for the request.

curl https://generativelanguage.googleapis.com/v1beta3/models/text-bison-001:generateText?key=$PALM_API_KEY \
        -H 'Content-Type: application/json' \
        -X POST \
        -d '{
            "prompt": {
                  "text": "Write a story about a magic backpack."
                  },
            "temperature": 1.0,
            "candidate_count": 3}'

Python

Use the generate_text method for the request.

  import google.generativeai as palm

  completion = palm.generate_text(model=model, prompt=x)

See the text quickstart for Python for the full example.

Node.js

Use the generateText method for the request.

 client
    .generateText({
      model: MODEL_NAME,
      prompt: {
        text: prompt,
      },
    })
    .then((result) => {
      console.log(JSON.stringify(result));
    });

See the text quickstart for Node.js for a full example.

Swift

Use the generateText method for the request.

let response = try await client?.generateText(with: "Write a story about a magic backpack.",
                                              temperature: 0.3,
                                              candidateCount: 3)
if let candidate = response?.candidates?.first, let text = candidate.output {
  print(text)
}

See the text quickstart for Swift for a full example.

Generate message

To generate chat responses using the chat service, send a GenerateMessageRequest. The request requires two parameters: model and prompt.

  • model: This is required and must match one of the chat models available.
  • prompt: This field is required. It must contain at least one message and can optionally include a context for the chat as well as examples.

See the chat reference for more details on each required and optional field.

The following shows the method for each supported language.

Curl

Use the generateMessage method for the request.

curl  -H 'Content-Type: application/json' \
        -X POST \
        https://generativelanguage.googleapis.com/v1beta3/models/chat-bison-001:generateMessage?key=$PALM_API_KEY  \
        -d '{
            "prompt": {"messages": [{"content":"hi"}]},
            "temperature": 0.1,
            "candidate_count": 1}'

Python

Use the chat method for the request.

  import google.generativeai as palm

  response = palm.chat(messages=x)

See the chat quickstart for Python for the full example.

Node.js

Use the generateMessage method for the request.

 async function main() {
    const result = await client.generateMessage({
      model: MODEL_NAME,
      prompt: {
        messages: [{ content: "How tall is the Eiffel Tower?" }],
      },
    });

    console.log(result[0].candidates[0].content);
  }

See the chat quickstart for Node.js for a full example.

Java

Create a GenerateMessageRequest by passing a model name and prompt to the GenerateMessageRequest.Builder:

private fun createMessageRequest(prompt: MessagePrompt): GenerateMessageRequest {
    return GenerateMessageRequest.newBuilder()
        .setModel(MODEL_NAME) // Required
        .setPrompt(prompt) // Required
        .build()
}

See the chat quickstart for Java for a full example.

Swift

Use the chat method for the request.

let response = client?.chat(message: "How tall is the Eiffel Tower?",
                            context: "You are a tourist guide. Answer all questions with some level of detail.",
                            temperature: 0.3,
                            candidateCount: 3)
if let candidate = response?.candidates?.first, let text = candidate.content {
  print(text)
}

See the chat quickstart for Swift for a full example.

PaLM API for Embeddings

The embedding service in the PaLM API generates state-of-the-art embeddings for words, phrases, and sentences. The resulting embeddings can then be used for NLP tasks such as semantic search, text classification, and clustering, among many others. This section describes what embeddings are and highlights some key use cases for the embedding service to help you get started. When you're ready to start developing, you can find complete runnable code in the embeddings quickstart.

What are embeddings?

Text embeddings are a natural language processing (NLP) technique that converts text into numerical vectors. Embeddings capture semantic meaning and context, which results in text with similar meanings having closer embeddings. For example, the sentences "I took my dog to the vet" and "I took my cat to the vet" would have embeddings that are close to each other in the vector space since they describe similar contexts.

This is important because it unlocks many algorithms that can operate on vectors but not directly on text.

You can use these embeddings or vectors to compare different texts and understand how they relate. For example, if the embeddings of the text "cat" and "dog" are close together you can infer that these words are similar in meaning, in context, or both. This ability allows a variety of use cases described in the next section.
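The notion of "close together" is usually made precise with cosine similarity, which measures the angle between two vectors (1.0 means identical direction). A small self-contained sketch, with toy three-dimensional vectors standing in for real embeddings:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot product
    divided by the product of the vector magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings of "cat" and "dog".
cat = [0.9, 0.1, 0.3]
dog = [0.8, 0.2, 0.35]
print(cosine_similarity(cat, dog))
```

Real embeddings have many more dimensions, but the comparison works the same way.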

Use cases

Text embeddings power a variety of NLP use cases. For example:

  • Information Retrieval: The goal is to retrieve semantically similar text given a piece of input text. A variety of applications can be supported by an information retrieval system such as semantic search, answering questions, or summarization. See the document search notebook for an example.
  • Classification: You can use embeddings to train a model to classify documents into categories. For example, if you want to classify user comments as negative or positive, you can use the embeddings service to get the vector representation of each comment to train the classifier.
  • Clustering: Comparing vectors of text can show how similar or different they are. This feature can be used to train a clustering model that groups similar text or documents together.
  • Vector DB: You can store your generated embeddings in a vector DB to improve the accuracy and efficiency of your NLP application. For example, you can use a vector DB to improve the capabilities of a document search.

Generate embeddings

To generate embeddings, send an EmbedTextRequest to the embedding service. The request requires two parameters: model and text.

  • model: This is required and must match one of the embedding models available.
  • text: This field is required and can be a word, phrase or document within the token limit.

See the embeddings reference for more details on each field.

The following shows the method for each supported language.

Curl

curl -H 'Content-Type: application/json' \
       -X POST \
       https://generativelanguage.googleapis.com/v1beta3/models/embedding-gecko-001:embedText?key=$PALM_API_KEY \
       -d '{"text": "say something nice!"}'

Python

Use the generate_embeddings method for the request.

  import google.generativeai as palm

  embedding_x = palm.generate_embeddings(model=model, text=x)

See the embeddings quickstart for Python for the full example.

Node.js

Use the embedText method for the request.

 client
   .embedText({
     model: MODEL_NAME,
     text: text,
   })
   .then((result) => {
     console.log(JSON.stringify(result));
   });

See the embedding quickstart for Node.js for the full example.

Tune models

The tuning service gives you the ability to tune the model behind the text service of the PaLM API. The goal of model tuning is to further improve the performance of the model for a specific task. Model tuning works by providing the model with a training dataset containing many examples of the task. For niche tasks, you can get significant improvements in model performance by tuning the model on a modest number of examples.

See the tuning guide to learn more.

Models

To get the most current model information, use the list_models method to list all the models available and then the get_model method to get the metadata for a particular model. Run any of the quickstarts, such as the embeddings quickstart, to see these methods in action.
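As a sketch of how you might work with a model listing once you have it, the following filters a ListModels REST response down to models that support a particular method (the response below is abbreviated, and the helper `models_supporting` is illustrative):

```python
def models_supporting(method, list_models_response):
    """Pick out the names of models whose supportedGenerationMethods
    include `method`, from a ListModels response dict."""
    return [
        m["name"]
        for m in list_models_response.get("models", [])
        if method in m.get("supportedGenerationMethods", [])
    ]

# Abbreviated response, shaped like the ListModels output.
response = {
    "models": [
        {"name": "models/text-bison-001",
         "supportedGenerationMethods": ["generateText"]},
        {"name": "models/chat-bison-001",
         "supportedGenerationMethods": ["generateMessage"]},
    ]
}
print(models_supporting("generateText", response))
```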

Further reading