Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[otel-data] Add @timestamp as sort field for logs and traces #114049

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

gregkalapos
Copy link
Contributor

Solves #113397.

@gregkalapos gregkalapos added >non-issue :Data Management/Data streams Data streams and their lifecycles auto-backport-and-merge Automatically create backport pull requests and merge when ready v8.16.0 labels Oct 3, 2024
@gregkalapos gregkalapos self-assigned this Oct 3, 2024
@gregkalapos gregkalapos requested a review from a team as a code owner October 3, 2024 17:19
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine elasticsearchmachine added Team:Data Management Meta label for data/management team external-contributor Pull request authored by a developer outside the Elasticsearch team v9.0.0 labels Oct 3, 2024
@@ -8,7 +8,7 @@ template:
index:
mode: logsdb
sort:
field: [ "resource.attributes.host.name" ]
field: [ "@timestamp", "resource.attributes.host.name" ]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need resource.attributes.host.name? Why was it there in the first place?

Copy link
Contributor Author

@gregkalapos gregkalapos Oct 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what i understand the default for this setting is @timestamp and host.name. I took this from @felixbarny in #113397

So our goal is to just mimic the default.

I think alias passthrough does not help here, so we essentially just mimic the default, but in the otel case host.name is resource.attributes.host.name.

So I think we still want it there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default sorts first by host name, then by timestamp. That’s an important detail that we should keep. The result is that the metadata for that host can be deduplicated because there are a a lot of the same values next to each other.

If we sort by timestamp first, only documents with the exact same timestamp are sorted by host name.

@@ -70,3 +70,24 @@ setup:
- match: { hits.hits.0.fields.error\.exception\.type: ["MyException"] }
- match: { hits.hits.0.fields.error\.exception\.message: ["foo"] }
- match: { hits.hits.0.fields.error\.stack_trace: ["Exception in thread \"main\" java.lang.RuntimeException: Test exception\n at com.example.GenerateTrace.methodB(GenerateTrace.java:13)\n at com.example.GenerateTrace.methodA(GenerateTrace.java:9)\n at com.example.GenerateTrace.main(GenerateTrace.java:5)"] }
---
"@timestamp should be used as sort field":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about these tests. It feels like they may rely too much on an implementation details that _search returns documents in order of how they're indexed. I don't think there's a guarantee on the ordering of search results when no sorting or scoring is involved. We're not sorting because we want the search results to return in a specific order but because it has benefits wrt storage savings. A unit tests isn't the best way to verify that. So maybe just check that the index setting has been applied or even just remove the tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport-and-merge Automatically create backport pull requests and merge when ready :Data Management/Data streams Data streams and their lifecycles external-contributor Pull request authored by a developer outside the Elasticsearch team >non-issue Team:Data Management Meta label for data/management team v8.16.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants