Setup OpenAI
Now that we have set up our Python environment, we can start using the OpenAI API.
Let’s open the OpenAI Platform. Make sure that you are logged in. Here, we will first look at the OpenAI Playground and then we will create an API key. We need this key to use the OpenAI API from our Python code.
Explore the OpenAI Platform
Explore the OpenAI Docs
The OpenAI API is very well documented: platform.openai.com/docs/overview.
Building prompts
When we were exploring the OpenAI Playground, we saw that the messages we pass to the LLM consist of a system prompt and a user prompt. The different roles in the messages are fully documented in the OpenAI API docs.
Understanding Roles in the messages Argument
When using the chat completions API, you create prompts by providing an array of messages that contain instructions for the model. Each message can have a different role, which influences how the model might interpret the input. Each entry in the messages list is a dictionary with a role and a content key. The role specifies who is “speaking,” which helps the model generate contextually appropriate responses.
Roles and Their Purposes
1. system
- Purpose: Sets the behavior, tone, and personality of the AI. Think of this as the “guiding principles” for the model.
- When to Use: At the beginning of a conversation to establish how the assistant should behave.
- Example:
{"role": "system", "content": "You are a helpful and polite assistant."}
- Effect:
- It tells the model to frame all its responses according to the specified behavior.
- For example, defining the assistant as “concise” will encourage brief replies.
2. user
- Purpose: Represents the input from the person using the model.
- When to Use: Every time the user provides input or asks a question.
- Example:
{"role": "user", "content": "Can you explain the roles in the messages argument?"}
- Effect:
- The model treats this as a direct prompt to respond.
- The user’s input frames the assistant’s reply.
3. assistant
- Purpose: Represents the AI’s responses in the conversation.
- When to Use: To show the model what it has previously said, especially in multi-turn interactions.
- Example:
{"role": "assistant", "content": "Of course! Here’s an explanation of the roles..."}
- Effect:
- By including prior responses, you ensure the model has full context for the ongoing conversation.
4. function (Optional, Advanced)
- Purpose: Represents a structured response when calling functions integrated with the AI.
- When to Use: In applications where the AI triggers external functions (e.g., retrieving weather data or performing calculations).
- Example:
{"role": "function",
"name": "get_weather",
"content": "{\"location\": \"Zurich\"}"}
- Effect:
- Used in function-calling mode to indicate what data or output the function provides.
We will not use the function role in this workshop.
Putting It All Together
Here’s an example of a complete messages argument in a conversational context:
[
{"role": "system", "content": "You are a friendly travel assistant."},
{"role": "user", "content": "Can you suggest a good vacation spot for December?"},
{"role": "assistant", "content": "Sure! How about visiting the Swiss Alps for skiing?"},
{"role": "user", "content": "That sounds great. What else can I do there?"}
]
By structuring your prompts as an array of messages with different roles, you can have more control over the conversation flow and provide additional context or instructions to the model as needed.
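As a preview of the API calls we will write later in this workshop, here is a minimal sketch of how such a messages array is sent to the model (it assumes the openai package is installed and an API key is configured, as described below):
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

# The full message history is sent with every request;
# the model sees the whole conversation each time.
messages = [
    {"role": "system", "content": "You are a friendly travel assistant."},
    {"role": "user", "content": "Can you suggest a good vacation spot for December?"},
    {"role": "assistant", "content": "Sure! How about visiting the Swiss Alps for skiing?"},
    {"role": "user", "content": "That sounds great. What else can I do there?"},
]

response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(response.choices[0].message.content)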
LLM Parameters
When generating responses from a language model, there are several parameters that can be adjusted to control the output behavior. Here are some common parameters:
Temperature:
- Controls the randomness or creativity of the generated text.
- Higher temperature (e.g., 0.8-1.0) produces more diverse and unpredictable outputs.
- Lower temperature (e.g., 0.2-0.5) generates more conservative and focused responses.
Top-k Sampling:
- Limits the model’s vocabulary to the top-k most likely tokens at each step.
- Helps control the diversity of the output by restricting the model’s choices.
- Higher values of k allow more diverse outputs, while lower values produce more focused responses.
Top-p Sampling (Nucleus Sampling):
- Similar to top-k sampling, but instead of considering the top-k tokens, it considers the smallest set of tokens whose cumulative probability exceeds a specified threshold (p).
- Allows for more dynamic control over the diversity of the output.
Repetition Penalty:
- Penalizes the model for repeating the same or similar text patterns.
- Helps prevent excessive repetition and promotes more diverse outputs.
Max Length:
- Specifies the maximum number of tokens (words or subwords) in the generated output.
- Useful for controlling the length of the response and preventing excessive verbosity.
These parameters can be adjusted based on the specific use case and desired output characteristics. For example, higher temperature and top-k/top-p values may be preferred for creative writing tasks, while lower values could be more suitable for factual or analytical tasks where precision is important.
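With the OpenAI Python package, several of these parameters can be passed directly to chat.completions.create. Note that the OpenAI chat completions API does not expose top-k sampling; its frequency_penalty and presence_penalty parameters play the role of a repetition penalty. A minimal sketch:
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a slogan for a hiking app."}],
    temperature=0.8,        # higher -> more diverse, creative output
    top_p=0.9,              # nucleus sampling threshold
    frequency_penalty=0.5,  # discourages repeating the same tokens
    max_tokens=50,          # caps the length of the generated output
)
print(response.choices[0].message.content)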
Suggestion for using temperature and top-p
Top-p sampling, also known as nucleus sampling, is a way to control the diversity of the text generated by a language model. It works by considering the most likely words or tokens at each step, but instead of just taking the top few, it takes the smallest set of words that make up a certain percentage (p) of the total probability.
For example, if p is set to 0.9, the model will consider the words that make up the top 90% of the probability distribution at each step. This allows for more diverse and creative outputs compared to just taking the single most likely word.
You can use the temperature and top-p parameters in combination to make the LLM more creative or more focused.
Increase the temperature parameter to make the model’s outputs more diverse and creative. However, if the temperature is too high, the outputs may become nonsensical or incoherent. In that case, you can lower the top-p value to restrict the model’s vocabulary and make the outputs more focused and coherent.
If you find that you need to lower the top-p value below 0.5 (or 50%) to keep the outputs coherent, it may be better to lower the temperature instead. Then, you can try adjusting the top-p value again to find the right balance between diversity and coherence.
The key is to experiment with different combinations of temperature and top-p values to achieve the desired level of creativity and coherence for your specific use case.
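As an illustration, here are two hypothetical settings you might compare, one tuned for creative output and one for focused output (reusing the client from the sketch above):
# Creative: high temperature, broad nucleus
creative = {"temperature": 1.0, "top_p": 0.95}

# Focused: low temperature, tighter nucleus
focused = {"temperature": 0.2, "top_p": 0.8}

for name, params in [("creative", creative), ("focused", focused)]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Describe Bern in one sentence."}],
        **params,
    )
    print(name, "->", response.choices[0].message.content)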
Setting up the OpenAI API Key
When working locally with the OpenAI API, you need to set up an API key. The API key is a unique identifier that allows you to authenticate and access the OpenAI language models and services. It acts as a secure credential, granting you authorized access to the API endpoints. This key should be kept secret—please do not share it.
To set up an OpenAI API key, follow these steps:
- Go to the OpenAI API Settings page and navigate to the API Keys section (Settings > API Keys).
- Click on the “Create new secret key” button.
- Leave the Permissions set to the default value (All permissions).
- Give your key a descriptive name and click “Create secret key”.
- Copy the generated secret key. This is the key you’ll use to authenticate your API requests.
- Store the key securely, as you would with any other sensitive credential. Do not share or commit this key to version control.
Once you have your API key, you can set it as an environment variable or pass it directly to the OpenAI Python library when making API calls.
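For example (a sketch; the explicit-argument variant is useful when you manage the key yourself):
import os
from openai import OpenAI

# Option 1: rely on the OPENAI_API_KEY environment variable
client = OpenAI()

# Option 2: pass the key explicitly
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])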
Setting the API key as an environment variable
In your VSCode workspace, create a new file called .env. In this file, add the following line:
OPENAI_API_KEY=<your-api-key>
where <your-api-key> is the API key you copied in the previous step. It should look like this:
OPENAI_API_KEY=sk-proj-...
Install the OpenAI Python Package
To use the OpenAI API in Python, we need to install the OpenAI Python package. You can do this by running the following command in your terminal:
pip install openai
You should see output like this:
❯ pip install openai
Collecting openai
Downloading openai-1.55.1-py3-none-any.whl.metadata (24 kB)
Collecting anyio<5,>=3.5.0 (from openai)
Using cached anyio-4.6.2.post1-py3-none-any.whl.metadata (4.7 kB)
Collecting distro<2,>=1.7.0 (from openai)
Using cached distro-1.9.0-py3-none-any.whl.metadata (6.8 kB)
Collecting httpx<1,>=0.23.0 (from openai)
and then:
Installing collected packages: typing-extensions, tqdm, sniffio, jiter, idna, h11, distro, certifi, annotated-types, pydantic-core, httpcore, anyio, pydantic, httpx, openai
Successfully installed annotated-types-0.7.0 anyio-4.6.2.post1 certifi-2024.8.30 distro-1.9.0 h11-0.14.0 httpcore-1.0.7 httpx-0.27.2 idna-3.10 jiter-0.8.0 openai-1.55.1 pydantic-2.10.2 pydantic-core-2.27.1 sniffio-1.3.1 tqdm-4.67.1 typing-extensions-4.12.2
Install the dotenv package
We need to install the python-dotenv package to load the API key from the .env file. You can install it by running the following command in your terminal:
pip install python-dotenv
Testing the OpenAI API
You can test the OpenAI API by creating a new Python file (e.g., test-openai.py) and running the following code:
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "system", "content": "You are a helpful assistant."},
              {"role": "user", "content": "What is the weather in Bern?"}]
)

print(response)
To run Python code in VSCode, you can either run the entire file or select the code that you want to run and press Shift+Enter. You may be asked to install the ipykernel package the first time you run a Python file. If so, go ahead and select the Install option. Alternatively, you can install it by running pip install ipykernel in your terminal.
If you run the code, you should see output like this:
response = ChatCompletion(
    id='chatcmpl-AYDrMDAY2RokWtkN47ljomaZML0DH',
    choices=[
        Choice(
            finish_reason='stop',
            index=0,
            logprobs=None,
            message=ChatCompletionMessage(
                content='I\'m sorry, but I cannot provide real-time weather updates. To get the current weather in Bern, I recommend checking a reliable weather website or app for the latest forecast and conditions.',
                refusal=None,
                role='assistant',
                audio=None,
                function_call=None,
                tool_calls=None
            )
        )
    ],
    created=1732719792,
    model='gpt-4o-mini-2024-07-18',
    object='chat.completion',
    service_tier=None,
    system_fingerprint='fp_0705bf87c0',
    usage=CompletionUsage(
        completion_tokens=37,
        prompt_tokens=24,
        total_tokens=61,
        completion_tokens_details=CompletionTokensDetails(
            accepted_prediction_tokens=0,
            audio_tokens=0,
            reasoning_tokens=0,
            rejected_prediction_tokens=0
        ),
        prompt_tokens_details=PromptTokensDetails(
            audio_tokens=0,
            cached_tokens=0
        )
    )
)
The ChatCompletion object is the response received from the OpenAI API when you make a request for text completion or generation. It contains several key pieces of information:
- id: A unique identifier for the specific API request.
- choices: A list of possible completions generated by the AI model. In this case, there is only one choice.
- choices[0].message.content: The actual text generated by the AI model in response to your prompt.
- created: The timestamp when the API request was processed.
- model: The name of the AI model used to generate the response.
- usage: Details about the number of tokens used for the prompt and the generated text.
To put it simply, the ChatCompletion object represents the AI’s response to your input, including the generated text and some metadata about the request and the model used.
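For example, you can read the metadata fields directly from the response object (matching the output shown above):
print(response.model)               # e.g., 'gpt-4o-mini-2024-07-18'
print(response.usage.total_tokens)  # tokens used for prompt + completion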
If you just want to get the text of the response, you can access it with response.choices[0].message.content:
print(response.choices[0].message.content)
In this case, you should see the following output:
I'm sorry, but I cannot provide real-time weather updates.
To get the current weather in Bern, I recommend checking
a reliable weather website or app for the latest forecast and conditions.
Testing the OpenAI API with a Jupyter Notebook
You can also test the OpenAI API with a Jupyter Notebook. To do this, create a new Jupyter Notebook and insert the following code in individual cells.
from dotenv import load_dotenv
from openai import OpenAI
load_dotenv()
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "system", "content": "You are a helpful assistant."},
              {"role": "user", "content": "What is the weather in Bern?"}]
)
print(response)
print(response.choices[0].message.content)
Example workspace
For convenience, you can download an example workspace here:
Once you have downloaded the ZIP file, unzip it and open the folder in VSCode. You can then create a virtual environment and install the dependencies. Once you have done this, you can run the code in the Jupyter Notebook cells or the Python file as described above.