One popular choice is the ChatML format, which is a good, flexible choice for many use cases. It looks like this:
```jinja
{%- for message in messages %}
{{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}
{%- endfor %}
```
Source: Hugging Face's Chat Templates documentation; see the article (HF docs) for full details.
OpenAI's switch to ChatML was announced in "Introducing ChatGPT and Whisper APIs", where they write:
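The Jinja template above is just string concatenation; a minimal sketch of the same rendering by hand in plain Python (the helper name `render_chatml` is illustrative, not part of any library):

```python
def render_chatml(messages):
    """Concatenate messages using ChatML's <|im_start|>/<|im_end|> delimiters,
    exactly as the Jinja template does."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    )

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the OpenAI mission?"},
]
print(render_chatml(messages))
```

Each message becomes one `<|im_start|>role\ncontent<|im_end|>` block, and blocks are joined with newlines.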
API: Traditionally, GPT models consume unstructured text, which is represented to the model as a sequence of "tokens." ChatGPT models instead consume a sequence of messages together with metadata. (For the curious: under the hood, the input is still rendered to the model as a sequence of "tokens" for the model to consume; the raw format used by the model is a new format called Chat Markup Language ("ChatML").)
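Because the delimiters are unambiguous, the rendered sequence can also be mapped back to the structured message list. A small stdlib sketch, assuming well-formed ChatML input (the `parse_chatml` helper is hypothetical, not a library function):

```python
import re

# One ChatML block: <|im_start|>role\ncontent<|im_end|>\n
CHATML_RE = re.compile(r"<\|im_start\|>(\w+)\n(.*?)<\|im_end\|>\n", re.DOTALL)

def parse_chatml(text):
    """Recover {role, content} messages from a ChatML-rendered string."""
    return [{"role": r, "content": c} for r, c in CHATML_RE.findall(text)]

sample = "<|im_start|>user\nWhat is the OpenAI mission?<|im_end|>\n"
print(parse_chatml(sample))
```

This round-trip is what "messages together with metadata" means in practice: the roles survive rendering as part of the raw token stream.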
Code example (OpenAI Python bindings):
```python
import openai  # legacy pre-1.0 bindings; openai.ChatCompletion was removed in openai>=1.0

completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "What is the OpenAI mission?"}],
)
print(completion)
```
To learn more about the GPT-3.5 API, see OpenAI's Chat guide.
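The `ChatCompletion` interface shown above is from the pre-1.0 Python bindings. A sketch of the equivalent call with `openai>=1.0`, guarded so it only hits the API when `OPENAI_API_KEY` is set (an assumption of this example):

```python
import os

def build_request():
    # Same list-of-messages payload shape as the legacy example above.
    return {
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "What is the OpenAI mission?"}],
    }

# Only call the API when credentials are available (requires openai>=1.0 installed).
if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    completion = client.chat.completions.create(**build_request())
    print(completion.choices[0].message.content)
```

Note that the message format itself is unchanged between the two client versions; only the call site moved from `openai.ChatCompletion.create` to `client.chat.completions.create`.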
See also:
- "The Introduction Of Chat Markup Language (ChatML) Is Important For A Number Of Reasons", a quick explainer on the ChatML format
- "OpenAI Introduced Chat Markup Language (ChatML) Based Input To Non-Chat Modes" by Cobus Greyling on Medium
- chatml.md in the openai-python repository
- "Chat Markup Language ChatML (Preview)", Azure OpenAI documentation