Update ChatRole.cs #5753

Licantrop0 · 2024-12-19T00:35:43Z

Adds developer role for o1 and newer models.
See: Developer messages

stephentoub · 2024-12-19T00:40:38Z

src/Libraries/Microsoft.Extensions.AI.Abstractions/ChatCompletion/ChatRole.cs

+
+    /// <summary>Gets the role that sets developer-provided instructions the model should follow, regardless of messages sent by the user.</summary>
+    /// <remarks>With o1 models and newer, developer messages replace the previous system messages.</remarks>
+    public static ChatRole Developer { get; } = new("developer");


I'm not convinced we want to add this. The other roles are all fairly common across multiple services. Right now this one is specific to OpenAI. We don't have "model" as a first-class API, for example, even though it's used by Gemini. Until the underlying OpenAI library is updated, this won't even work with the OpenAI provider.

Licantrop0 · 2024-12-19

Developer messages would effectively replace system messages for all new OpenAI models:
https://cdn.openai.com/spec/model-spec-2024-05-08.html#overview

With these models, sending system messages throws an error.
How would you like to proceed?

SteveSandersonMS · 2024-12-19

From the linked document:

According to this, "Developer" simply replaces "System". So I would expect the OpenAI adapter to transmit "System" messages as "Developer" (if it knows which OpenAI models expect System and which expect Developer).

This seems like an unfortunate design that could confuse the ecosystem. Maybe there's a good reason for it but it's not clear from the naming alone.

If it does end up creating widespread confusion, we could add ChatRole.Developer with the expectation that other clients will transmit it as "System", and then consumers can use whichever role name they prefer. But again, it's a shame that the ecosystem would be put in this position.

Licantrop0 · 2024-12-19

Today, if you try to send system messages with o1, it throws an error saying that system messages are not supported.
@SteveSandersonMS, are you suggesting adding custom code in CompleteAsync / CompleteStreamingAsync to transform every system message into a developer message?

SteveSandersonMS · 2024-12-20

@Licantrop0 Yes, though it would have to be only for the models that require this. But it still might not work end-to-end - it depends on whether the underlying OpenAI client supports it yet.

It would be interesting to try this and see if it does work, and how confident we can be about mapping System->Developer based on the model. I don't know if this would be a valid solution for Azure OpenAI, since in that case we don't necessarily know which model is being used, only the deployment name.
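
A minimal sketch of the System->Developer mapping discussed here, written against plain role strings rather than the real M.E.AI or OpenAI client types. The `RoleMapper` class and its caller-supplied list of model IDs are hypothetical: nothing in this thread establishes which model IDs require the new role, so the list is left to the caller, and an unknown model (for example, an Azure OpenAI deployment name that cannot be resolved) leaves the role unchanged.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Microsoft.Extensions.AI or the OpenAI package.
public sealed class RoleMapper
{
    private readonly HashSet<string> _developerRoleModels;

    // developerRoleModels: model IDs assumed (by the caller) to expect "developer"
    // instead of "system". This sketch does not know the authoritative list.
    public RoleMapper(IEnumerable<string> developerRoleModels) =>
        _developerRoleModels = new HashSet<string>(developerRoleModels, StringComparer.OrdinalIgnoreCase);

    // Returns the role name to send on the wire for an abstraction-level role.
    // A null or unrecognized modelId keeps the role as-is.
    public string ToWireRole(string role, string? modelId) =>
        role == "system" && modelId is not null && _developerRoleModels.Contains(modelId)
            ? "developer"
            : role;
}
```

With this shape, callers keep writing system messages, and only the adapter decides what goes on the wire; the open question from the thread (how to know the model behind a deployment name) is pushed into how `modelId` is obtained.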

Licantrop0 · 2024-12-20

Also, we would have to scan through the full list of chat messages replacing system with developer every time a completion is requested, and it's not very efficient. And yes, as you pointed out, some heuristics to infer the model from the deployment name would need to be in place.

I understand that the developer role is specific to OpenAI, and that adding it to the AI.Abstractions project is not clean architecture-wise, but leaving the choice to the end developer, it could be the best/safest option we have.

I still haven't really tested if it works (access to o1 requires spending 1k in OpenAI credits, and the non-preview version is still not available in Azure), but for now I'm using this code:

    var DevRole = new ChatRole("developer");
    List<ChatMessage> chatMessages = [new ChatMessage(DevRole, DeveloperPromptO1)];
    chatMessages.Add(new ChatMessage(ChatRole.User, userInput));

    var asyncUpdates = OpenAIHelpers.OpenAIClient.CompleteStreamingAsync(chatMessages);
    await foreach (var update in asyncUpdates)
    {
        response.Append(update);
        Console.Write(update);
    }

    // Streaming doesn't add Assistant replies to the history, so we add the last one here
    chatMessages.Add(new ChatMessage(ChatRole.Assistant, response.ToString()));

@stephentoub / @SteveSandersonMS: do you agree?

stephentoub · 2024-12-20

> Also, we would have to scan through the full list of chat messages replacing system with developer every time a completion is requested, and it's not very efficient

There's no meaningful additional overhead here. Every request already entails translating from the M.E.AI object to the target, either to some other object model or to the request body being written out. It already needs to look at every message's role, it's just a question of what it translates it to.
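
To illustrate the point, here is a sketch of a per-request translation loop over stand-in types. `Message`, `WireMessage`, and `RequestTranslator` are hypothetical and are not the M.E.AI or OpenAI types. The loop already reads each message's role to build the outgoing object, so emitting a different role name is one more case in the same pass rather than a second scan of the history.

```csharp
using System;
using System.Collections.Generic;

// Stand-in types for illustration only.
public sealed record Message(string Role, string Text);
public sealed record WireMessage(string Role, string Content);

public static class RequestTranslator
{
    // Converts abstraction-level messages to the wire shape in a single pass.
    // useDeveloperRole is an assumed flag; how it is decided is out of scope here.
    public static List<WireMessage> Translate(IEnumerable<Message> messages, bool useDeveloperRole)
    {
        var result = new List<WireMessage>();
        foreach (var m in messages)
        {
            // The role is inspected here regardless; the rename is just another case.
            string wireRole = m.Role switch
            {
                "system" when useDeveloperRole => "developer",
                _ => m.Role,
            };
            result.Add(new WireMessage(wireRole, m.Text));
        }
        return result;
    }
}
```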

> but leaving the choice to the end developer, it could be the best/safest option we have

M.E.AI.Abstractions provides an abstraction. It needs to be mapped to the underlying target, and that mapping is rarely 1:1. It's part of the deal that things may not be represented in the abstraction exactly how they are under the covers.

> I still haven't really tested if it works

It will not work as intended today, as the OpenAI 2.1.0 package has no knowledge of this new role type.

I suggest efforts at this point would be better spent learning why OpenAI felt a need to create this new role type. At the moment it seems like it's simply a rename, which makes the current decision not to have any compatibility between "system" and "developer" very confusing.

Regardless, while I do appreciate the effort, the PR as it currently stands is not a complete solution and not something to be merged. It's adding a public ChatRole that won't be appropriately recognized by the only target that could even support it today.

dotnet-comment-bot · 2024-12-20T19:09:01Z

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

Project	Expected	Actual
Microsoft.Extensions.AI	88	89

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=900944&view=codecoverage-tab

Update ChatRole.cs

357b484

Adds `developer` role for `o1` and newer models.

Licantrop0 requested a review from a team as a code owner December 19, 2024 00:35

dotnet-policy-service bot assigned Licantrop0 Dec 19, 2024

stephentoub reviewed Dec 19, 2024

Licantrop0 added 2 commits December 20, 2024 18:19

added link to model spec, fixed spacing.

bbc249f

Merge branch 'main' into patch-1

82b80b7

jeffhandley added the area-AI label Jan 5, 2025


Licantrop0 commented Dec 19, 2024 (edited)

stephentoub Dec 19, 2024 (edited)

Licantrop0 Dec 19, 2024 (edited)

SteveSandersonMS Dec 19, 2024

Licantrop0 Dec 19, 2024 (edited)

SteveSandersonMS Dec 20, 2024

Licantrop0 Dec 20, 2024

stephentoub Dec 20, 2024

dotnet-comment-bot commented Dec 20, 2024
