AI Agent Span Semantic Convention #1657

gyliu513 · 2024-12-06T16:49:02Z

Fixed part of #1530

Changes

Please provide a brief description of the changes here.

Note: if the PR is touching an area that is not listed in the existing areas, or the area does not have sufficient domain experts coverage, the PR might be tagged as experts needed and move slowly until experts are identified.

Merge requirement checklist

CONTRIBUTING.md guidelines followed.
Change log entry added, according to the guidelines in When to add a changelog entry.
- If your PR does not need a change log, start the PR title with [chore]
schema-next.yaml updated with changes to existing conventions.

gyliu513 · 2024-12-06T16:52:01Z

@lmolkova @lzchen @nirga @karthikscale3 @drewby this is a very early draft. We may need a long discussion on this, and I hope we can start from here.

Please share your comments here. I do not know yet whether we want to put the AI agent semantic convention in the same folder as gen-ai or whether we need a new folder for ai-agent. Thanks!

lmolkova

A few general points:

- We should not create attributes that duplicate existing gen_ai attributes. By default, we should use those instead of defining agent-specific ones.
- We need to define everything in YAML and stay compatible with the schema.

I have a draft here, microsoft#3, for an OpenAI assistant-like API, which covers a lot of similar things. PTAL.

lmolkova · 2024-12-15T22:49:04Z

docs/ai-agent/ai-agent-spans.md

+
+| Attribute                      | Type   | Description                                | Example                          | Requirement Level | Stability    |
+| ------------------------------ | ------ | ------------------------------------------ | -------------------------------- | ----------------- | --- |
+| `ai_agent.agent.name`          | string | Name of the agent.                         | `Researcher Bot`                 | Required          | ![Experimental](https://img.shields.io/badge/-experimental-blue)    |


Those are still GenAI agents, so I think this one should be `gen_ai.agent.name`.

lmolkova · 2024-12-15T22:49:29Z

docs/ai-agent/ai-agent-spans.md

+| `ai_agent.agent.role`          | string | Role assigned to the agent.                | `Data Collector`                 | Required          | ![Experimental](https://img.shields.io/badge/-experimental-blue)    |
+| `ai_agent.agent.backstory`     | string | Background story or context for the agent. | `Specializes in web data mining` | Required          | ![Experimental](https://img.shields.io/badge/-experimental-blue)    |
+| `ai_agent.agent.workflow_name` | string | Name of the workflow the agent is part of. | `Data Processing Pipeline`       | Required          | ![Experimental](https://img.shields.io/badge/-experimental-blue)    |
+| `ai_agent.agent.model`         | string | Underlying model powering the agent.       | `gpt-4`                          | Required          | ![Experimental](https://img.shields.io/badge/-experimental-blue)    |


How is the agent model different from `gen_ai.request.model`?

lmolkova · 2024-12-15T22:49:49Z

docs/ai-agent/ai-agent-spans.md

+| `ai_agent.agent.backstory`     | string | Background story or context for the agent. | `Specializes in web data mining` | Required          | ![Experimental](https://img.shields.io/badge/-experimental-blue)    |
+| `ai_agent.agent.workflow_name` | string | Name of the workflow the agent is part of. | `Data Processing Pipeline`       | Required          | ![Experimental](https://img.shields.io/badge/-experimental-blue)    |
+| `ai_agent.agent.model`         | string | Underlying model powering the agent.       | `gpt-4`                          | Required          | ![Experimental](https://img.shields.io/badge/-experimental-blue)    |
+| `ai_agent.agent.tools`         | array  | List of tools available to the agent.      | `["Web Scraper", "Analyzer"]`    | Recommended       | ![Experimental](https://img.shields.io/badge/-experimental-blue)    |


How would it be different from the generic gen_ai tool?
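The review comments above lean toward reusing existing `gen_ai` names rather than minting `ai_agent` ones. A minimal sketch of what that consolidation could look like follows. The mapping table and the `consolidate` helper are hypothetical and not part of the PR; the `gen_ai.agent.name` target comes from the review suggestion, while mapping the agent model to `gen_ai.request.model` is only an assumption based on the reviewer's question, which the thread leaves open.

```python
# Hypothetical mapping assembled from the review comments; not part of the PR.
PROPOSED_TO_GEN_AI = {
    "ai_agent.agent.name": "gen_ai.agent.name",      # suggested in review
    "ai_agent.agent.model": "gen_ai.request.model",  # assumption: reviewer only asked how they differ
}


def consolidate(attributes: dict) -> dict:
    """Return a copy with proposed names swapped for gen_ai names where a mapping exists."""
    return {PROPOSED_TO_GEN_AI.get(key, key): value for key, value in attributes.items()}


# Example values are taken from the attribute tables in the diff.
span_attributes = {
    "ai_agent.agent.name": "Researcher Bot",
    "ai_agent.agent.model": "gpt-4",
    "ai_agent.agent.role": "Data Collector",  # no gen_ai equivalent discussed, left unchanged
}
consolidated = consolidate(span_attributes)
```

Attributes without a discussed `gen_ai` counterpart, such as the role, pass through untouched, which keeps the open questions visible instead of guessing a name for them.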

svrnm · 2024-12-16T06:33:00Z

hey there,

I think this is a great starting point for a very hot topic right now. I have one major comment, and it's about workflows and tasks. If I understand correctly, there is a workflow-to-task relationship, and I would assume it is modelled as a parent span/child span relationship? So a workflow is a "parent span" and tasks are "child spans" of it. Is this a correct assumption?

If that's correct, and if I look at your examples, a workflow (and maybe even a task?) can be very long running, which is a currently unsolved piece of the OTel specification. So maybe the need to model AI agents could be a driving force behind providing a better specification for that, because I also see that workflows and tasks are not unique to AI agents; see the CICD pipeline attributes, for example.
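To make the assumed relationship concrete, here is a toy sketch of a workflow span with a task span as its child. The `Span` class is a hypothetical stand-in written for illustration, not the OpenTelemetry API, and the attribute names are the ones proposed in the draft tables.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """Toy stand-in for a tracing span; not the OpenTelemetry API."""
    name: str
    attributes: dict = field(default_factory=dict)
    parent: Optional["Span"] = None

    def child(self, name: str, attributes: dict) -> "Span":
        # A child span keeps a reference to the span it was started under.
        return Span(name=name, attributes=attributes, parent=self)


def ancestry(span: Optional[Span]) -> list:
    """Walk parent links and return span names from the root down to the given span."""
    names = []
    while span is not None:
        names.append(span.name)
        span = span.parent
    return list(reversed(names))


# Workflow as the parent span, task as its child, using example values from the draft.
workflow = Span("workflow", {"ai_agent.agent.workflow_name": "Data Processing Pipeline"})
task = workflow.child("task", {"ai_agent.task.name": "Data Collection"})
```

In this model the workflow span stays open for as long as any of its tasks run, which is where the long-running span concern raised above comes from.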

gyliu513 · 2024-12-16T20:29:00Z

> If that's correct, and if I look at your examples, a workflow (and maybe even a task?) can be very long running, which is a currently unsolved piece of the OTel specification (open-telemetry/opentelemetry-specification#373). So maybe the need to model AI agents could be a driving force behind providing a better specification for that, because I also see that workflows and tasks are not unique to AI agents; see the CICD pipeline attributes, for example.

@svrnm Yes, this is the case. At least from my point of view, the workflow and task relationship is very similar to the CICD pipeline attributes.

Let me review microsoft#3 from @lmolkova first, and I will try to update my PR soon after some discussion on microsoft#3

> - we should not create attributes that would be the same as existing gen_ai attributes. We should use those instead of defining agent ones by default
> - we need to define everything in yaml and stay compatible with the schema

@lmolkova yes, let me consolidate the agent attributes into gen_ai, but let me go through your PR microsoft#3 first. Thanks!

karthikscale3 · 2024-12-16T20:42:39Z

> hey there,
>
> I think this is a great starting point for a very hot topic right now. I have one major comment, and it's about workflows and tasks. If I understand correctly, there is a workflow-to-task relationship, and I would assume it is modelled as a parent span/child span relationship? So a workflow is a "parent span" and tasks are "child spans" of it. Is this a correct assumption?
>
> If that's correct, and if I look at your examples, a workflow (and maybe even a task?) can be very long running, which is a currently unsolved piece of the OTel specification. So maybe the need to model AI agents could be a driving force behind providing a better specification for that, because I also see that workflows and tasks are not unique to AI agents; see the CICD pipeline attributes, for example.

@svrnm / @gyliu513 - Just wanted to add my 2 cents here. This is very much an issue, and thanks for bringing it up. The OTel instrumentation we have (at Langtrace) for frameworks like CrewAI, DSPy, etc. runs into this issue from time to time, where a single trace occasionally has hundreds of spans. An option to flush in-progress spans would be ideal for these scenarios, so the user can see real-time feedback in the UI for ongoing agentic sessions. Having said that, the vast majority of the agents we are seeing (from our perspective) still work well with the existing capabilities. But we definitely need to think about this sooner rather than later.

karthikscale3 · 2024-12-16T20:44:19Z

docs/ai-agent/ai-agent-spans.md

+| ------------------------------ | ------ | ------------------------------------------ | -------------------------------- | ----------------- | --- |
+| `ai_agent.agent.name`          | string | Name of the agent.                         | `Researcher Bot`                 | Required          | ![Experimental](https://img.shields.io/badge/-experimental-blue)    |
+| `ai_agent.agent.role`          | string | Role assigned to the agent.                | `Data Collector`                 | Required          | ![Experimental](https://img.shields.io/badge/-experimental-blue)    |
+| `ai_agent.agent.backstory`     | string | Background story or context for the agent. | `Specializes in web data mining` | Required          | ![Experimental](https://img.shields.io/badge/-experimental-blue)    |


AFAICT, role and backstory are very CrewAI-specific concepts that may or may not be applicable to other frameworks. So maybe we should consider moving them into a CrewAI-specific namespace?

karthikscale3 · 2024-12-16T20:45:17Z

docs/ai-agent/ai-agent-spans.md

+
+| Attribute                   | Type    | Description                                                              | Example                            | Requirement Level | Stability    |
+| --------------------------- | ------- | ------------------------------------------------------------------------ | ---------------------------------- | ----------------- | --- |
+| `ai_agent.task.name`        | string  | Name of the task.                                                        | `Data Collection`                  | Required          | ![Experimental](https://img.shields.io/badge/-experimental-blue)    |


Do we also need an `ai_agent.task.input` in addition to these?

I'm curious whether we can consolidate it under the gen_ai.system|user|tool|assistant.message events rather than attributes.
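As a rough illustration of that idea, the sketch below records a task's input as `gen_ai.system.message` and `gen_ai.user.message` events instead of a dedicated input attribute. The helper name, the `body`/`content` shape of each event, and the sample strings are assumptions made for illustration; only the event names come from the comment above.

```python
def task_input_as_events(system_prompt: str, user_input: str) -> list:
    """Represent a task's input as gen_ai message events (hypothetical event shape)."""
    return [
        {"name": "gen_ai.system.message", "body": {"content": system_prompt}},
        {"name": "gen_ai.user.message", "body": {"content": user_input}},
    ]


# Made-up sample strings; a real instrumentation would pass the task's actual prompts.
events = task_input_as_events(
    "You are a data collection agent.",
    "Collect the requested records.",
)
event_names = [event["name"] for event in events]
```

With this approach, no separate `ai_agent.task.input` attribute is needed, at the cost of requiring consumers to read events attached to the task span.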

karthikscale3 · 2024-12-16T20:45:27Z

docs/ai-agent/ai-agent-spans.md

+| Attribute                | Type   | Description                                  | Example           | Requirement Level | Stability    |
+| ------------------------ | ------ | -------------------------------------------- | ----------------- | ----------------- | --- |
+| `ai_agent.tool.name`     | string | Name of the tool utilized by the agent.      | `Web Scraper`     | Required          | ![Experimental](https://img.shields.io/badge/-experimental-blue)    |
+| `ai_agent.tool.function` | string | Specific function or capability of the tool. | `Data Extraction` | Recommended       | ![Experimental](https://img.shields.io/badge/-experimental-blue)    |


Do we also need an `ai_agent.tool.input` in addition to these?

karthikscale3 · 2024-12-16T20:46:12Z

Nice first draft, @gyliu513. Thanks for starting this.

codefromthecrypt · 2024-12-27T00:17:45Z

@gyliu513 I'm giving you unsolicited advice and being intentionally not specific to a change here, because I think more research would lead you to your own changes. That's always the best (in my mind). Hope it helps.

One of the main gains from reorganizing the LLM (now GenAI) SIG to have a space in otel-contrib Python was being able to practice specs before committing to them. I have seen this done in practice in Java, and it helps quite a bit.

- Are you keen on instrumenting a draft PR on some open source agent library you believe is valid for this semconv?
- How about an example PII washed feed from agentic cloud provider data, which would translate

Another thing that could guide you: bookend timestamps in particular sound like a discussion that would have happened here in another domain (start_xx, end_xx). Certainly, it happened way back in the Zipkin days with "cs" and "cr", though these were separate events. Can you research some prior work in OTel where a spec like this was accepted or denied?

gyliu513 · 2024-12-27T03:19:48Z

@codefromthecrypt good comment, thanks and happy holidays!

> How about an example PII washed feed from agentic cloud provider data, which would translate

Can you please share more detail for your comment here?

I am now reviewing microsoft#3, and that PR has really helped a lot. I will probably update my PR soon after the new year based on microsoft#3.

codefromthecrypt · 2024-12-27T03:35:59Z

@gyliu513 for this comment I made "How about an example PII washed feed from agentic cloud provider data, which would translate"

What I mean is that most of the time we assume the data is coming from the application. For example, we instrument LangChain or something similar, and spans and metrics are collected directly from the app.

While I don't know what services exist, another way is cloud integration, where a platform is generating the signals. One example is AWS Bedrock, where you can get data regardless of what the developers do https://aws.amazon.com/blogs/mt/monitoring-generative-ai-applications-using-amazon-bedrock-and-amazon-cloudwatch-integration/

So, for this PR, I mean that if its scope is only for application instrumentation, then we should look at which frameworks we are considering and maybe a draft/experiment/proof of concept that exercises the specs you are making.

Beyond that, if you are thinking about a specific cloud integration (I don't know if you are), some sample data or documentation on what that agentic feed looks like could help us determine whether the semantic conventions here are valid for it or not.

Does that help? If not you can also quiz me on slack, but anyway happy holidays!

gyliu513 requested review from a team as code owners December 6, 2024 16:49

gyliu513 marked this pull request as draft December 6, 2024 16:49

AI Agent Span Semantic Convention

2e92a1c

gyliu513 force-pushed the agent branch from 7849133 to 2e92a1c Compare December 6, 2024 16:58

lmolkova requested changes Dec 15, 2024

svrnm mentioned this pull request Dec 16, 2024

Unified semantic conventions for tasks, workflows, pipelines, jobs #1688

Open

karthikscale3 reviewed Dec 16, 2024

gyliu513 mentioned this pull request Dec 18, 2024

Define client spans for Generative AI agents microsoft/opentelemetry-semantic-conventions#3

Open
