Set up OpenAI
Let’s start by opening the OpenAI Platform. Make sure that you are signed in. Here, we will first look at the OpenAI Playground and then create an API key. We need this key to use the OpenAI API from our Python code.
Explore the OpenAI Platform
Explore the OpenAI Docs
The OpenAI API is very well documented: platform.openai.com/docs/overview.
Building prompts
When we were exploring the OpenAI Playground, we saw that the messages we pass to the LLM consist of a system prompt and a user prompt. The different roles in the messages are fully documented in the OpenAI API docs.
Understanding Roles in the `messages` Argument
When using the chat completions API, you create prompts by providing an array of messages that contain instructions for the model. Each message can have a different role, which influences how the model might interpret the input. Each entry in the `messages` list is a dictionary with a `role` and a `content`. The `role` specifies who is “speaking,” which helps the model generate contextually appropriate responses.
Roles and Their Purposes
1. `system`
- Purpose: Sets the behavior, tone, and personality of the LLM. Think of this as the “guiding principles” for the model.
- When to Use: At the beginning of a conversation to establish how the assistant should behave.
- Example:
{"role": "system", "content": "You are a helpful and polite assistant."}
- Effect:
- It tells the model to frame all its responses according to the specified behavior.
- For example, defining the assistant as “concise” will encourage brief replies.
2. `user`
- Purpose: Represents the input from the person using the model.
- When to Use: Every time the user provides input or asks a question.
- Example:
{"role": "user", "content": "Can you explain the roles in the messages argument?"}
- Effect:
- The model treats this as a direct prompt to respond.
- The user’s input frames the assistant’s reply.
3. `assistant`
- Purpose: Represents the AI’s responses in the conversation.
- When to Use: To show the model what it has previously said, especially in multi-turn interactions.
- Example:
{"role": "assistant", "content": "Of course! Here's an explanation of the roles..."}
- Effect:
- By including prior responses, you ensure the model has full context for the ongoing conversation.
4. `function` (Optional, Advanced)
- Purpose: Carries the structured result of a function call back to the model.
- When to Use: In applications where the AI triggers external functions (e.g., retrieving weather data or performing calculations).
- Example:
{"role": "function",
"name": "get_weather",
"content": "{\"location\": \"Zurich\"}"}
- Effect:
- Used in function-calling mode to indicate what data or output the function provides.
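To make the flow concrete, here is a hedged sketch of how a `function` message typically fits into a conversation in the (legacy) function-calling mode: the model first requests a call via the `assistant` message’s `function_call` field, your code runs the function, and the result is passed back with the `function` role. The function name and return value are invented for illustration.

```json
[
  {"role": "user", "content": "What's the weather in Zurich?"},
  {"role": "assistant", "content": null,
   "function_call": {"name": "get_weather",
                     "arguments": "{\"location\": \"Zurich\"}"}},
  {"role": "function", "name": "get_weather",
   "content": "{\"temperature\": \"2°C\", \"conditions\": \"snowy\"}"}
]
```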
Putting It All Together
Here’s an example of a complete `messages` argument in a conversational context:
```json
[
  {"role": "system", "content": "You are a friendly travel assistant."},
  {"role": "user", "content": "Can you suggest a good vacation spot for December?"},
  {"role": "assistant", "content": "Sure! How about visiting the Swiss Alps for skiing?"},
  {"role": "user", "content": "That sounds great. What else can I do there?"}
]
```
By structuring your prompts as an array of messages with different roles, you can have more control over the conversation flow and provide additional context or instructions to the model as needed.
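To see how such a `messages` array is actually sent to the model, here is a minimal Python sketch. It assumes the `openai` package (v1+) is installed and that your API key is available in the `OPENAI_API_KEY` environment variable (see the setup section below); the model name is only an example.

```python
from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

messages = [
    {"role": "system", "content": "You are a friendly travel assistant."},
    {"role": "user", "content": "Can you suggest a good vacation spot for December?"},
    {"role": "assistant", "content": "Sure! How about visiting the Swiss Alps for skiing?"},
    {"role": "user", "content": "That sounds great. What else can I do there?"},
]

response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name; use any chat model you have access to
    messages=messages,
)

# The assistant's reply is the first (and usually only) choice.
print(response.choices[0].message.content)
```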
LLM Parameters
When generating responses from a language model, there are several parameters that can be adjusted to control the output behavior. Here are some common parameters:
Temperature:
- Controls the randomness or creativity of the generated text.
- Higher temperature (e.g., 0.8-1.0) produces more diverse and unpredictable outputs.
- Lower temperature (e.g., 0.2-0.5) generates more conservative and focused responses.
Top-k Sampling:
- Limits the model’s vocabulary to the top-k most likely tokens at each step.
- Helps control the diversity of the output by restricting the model’s choices.
- Higher values of k allow more diverse outputs, while lower values produce more focused responses.
Top-p Sampling (Nucleus Sampling):
- Controls the diversity of generated text by dynamically selecting from the most likely tokens.
- Works by considering only tokens whose cumulative probability exceeds a threshold (p).
- For example, with \(p=0.1\), the model considers only the smallest set of tokens whose cumulative probability adds up to 10% of the total probability mass.
- Higher values (closer to 1.0) allow more diverse outputs, while lower values produce more focused responses.
Repetition Penalty:
- Penalizes the model for repeating the same or similar text patterns.
- Helps prevent excessive repetition and promotes more diverse outputs.
Max Length:
- Specifies the maximum number of tokens (words or subwords) in the generated output.
- Useful for controlling the length of the response and preventing excessive verbosity.
These parameters can be adjusted based on the specific use case and desired output characteristics. For example, higher temperature and top-k/top-p values may be preferred for creative writing tasks, while lower values could be more suitable for factual or analytical tasks where precision is important.
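Not every parameter above is exposed by every API. In particular, the OpenAI chat completions API does not offer top-k sampling, and it implements repetition control through `frequency_penalty` and `presence_penalty`. Here is a hedged sketch of passing the available knobs; the model name and values are illustrative only:

```python
from openai import OpenAI

client = OpenAI()  # uses OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",          # example model name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a two-line poem about the Alps."},
    ],
    temperature=0.8,        # higher -> more diverse, creative output
    top_p=1.0,              # nucleus sampling threshold
    max_tokens=100,         # cap on the length of the generated reply
    frequency_penalty=0.5,  # penalizes tokens that already appeared often
    presence_penalty=0.0,   # penalizes any token that has appeared at all
)

print(response.choices[0].message.content)
```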
Suggestion for using `temperature` and `top-p`
Top-p sampling, also known as nucleus sampling, is a way to control the diversity of the text generated by a language model. It works by considering the most likely words or tokens at each step, but instead of just taking the top few, it takes the smallest set of words that make up a certain percentage (p) of the total probability.
For example, if p is set to 0.9, the model will consider the words that make up the top 90% of the probability distribution at each step. This allows for more diverse and creative outputs compared to just taking the single most likely word.
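To make the cumulative-probability mechanism concrete, here is a small self-contained Python sketch over a made-up five-token distribution; real models apply this to the full token vocabulary:

```python
import random

# Made-up next-token distribution for illustration.
probs = {"ski": 0.40, "hike": 0.25, "swim": 0.15, "read": 0.12, "fly": 0.08}

def top_p_filter(probs, p):
    """Keep the smallest set of tokens whose cumulative probability reaches p."""
    kept, cumulative = {}, 0.0
    for token, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = prob
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(kept.values())  # renormalize the kept tokens before sampling
    return {t: pr / total for t, pr in kept.items()}

nucleus = top_p_filter(probs, p=0.9)  # keeps ski, hike, swim, read; drops fly
token = random.choices(list(nucleus), weights=list(nucleus.values()))[0]
print(nucleus, "-> sampled:", token)
```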
You can use the `temperature` and `top-p` parameters in combination to make the LLM more creative or more focused.
If you find that you need to lower the `top-p` value below 0.5 (or 50%) to keep the outputs coherent, it may be better to lower the temperature instead. Then, you can try adjusting the `top-p` value again to find the right balance between diversity and coherence.
The key is to experiment with different combinations of `temperature` and `top-p` values to achieve the desired level of creativity and coherence for your specific use case.
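As one illustrative starting point (these numbers are assumptions, not official recommendations), you could define two presets and switch between them while experimenting:

```python
from openai import OpenAI

client = OpenAI()  # uses OPENAI_API_KEY from the environment

# Illustrative presets; tune them for your own use case.
creative = {"temperature": 0.9, "top_p": 1.0}  # brainstorming, creative writing
focused = {"temperature": 0.2, "top_p": 0.5}   # factual or analytical answers

response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name
    messages=[{"role": "user", "content": "Name three things to do in the Alps."}],
    **focused,  # or **creative
)
print(response.choices[0].message.content)
```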
Setting up the OpenAI API Key
When working with the OpenAI API, you need to set up an API key. The API key is a unique identifier that allows you to authenticate and access the OpenAI language models and services. It acts as a secure credential, granting you authorized access to the API endpoints. This key should be kept secret—please do not share it.
To set up an OpenAI API key, follow these steps:
- Go to the OpenAI API Settings page and navigate to the `API Keys` section (`Settings > API Keys`).
- Click on the “Create new secret key” button.
- Leave the `Permissions` set to the default value (`All permissions`).
- Give your key a descriptive name and click “Create secret key”.
- Copy the generated secret key. This is the key you’ll use to authenticate your API requests.
- Make sure to store the key securely, as you would with any other sensitive credential (e.g. in a password manager). Do not share or commit this key to version control.
Once you have your API key, you can set it as an environment variable or pass it directly to the OpenAI Python library when making API calls.
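For instance, here is a minimal sketch of both options; the environment variable name `MY_OPENAI_KEY` in the second option is a made-up example:

```python
import os
from openai import OpenAI

# Option 1: the client picks up the key from the OPENAI_API_KEY
# environment variable (e.g. set with: export OPENAI_API_KEY="sk-...").
client = OpenAI()

# Option 2: pass the key explicitly. Read it from somewhere secure
# (here, a hypothetical environment variable) rather than hard-coding it.
client = OpenAI(api_key=os.environ["MY_OPENAI_KEY"])
```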