[DO NOT MERGE] Prototype Entities (otep#264) in Java SDK #6855

Draft: wants to merge 18 commits into base: main

Contributor

@jsuereth jsuereth commented Nov 7, 2024

This is a prototype of Java SDK updates for Entities OTEP.

  • Adds Entity, EntityBuilder, EntityDetector to sdk common package.
  • Adds entity related methods to Resource and ResourceBuilder. Updates merge logic to match OTEP specification.
  • Creates a ResourceProvider which attempts to centralize Resource/Entity logic. This will need to be sorted out with existing sdk-extension ResourceProviders.
  • Updates unit tests due to auto-generated toString changes.

Note: This prototype does NOT attempt to isolate experimental features into internal directories and expose an experimental package. That would be done when fragmenting this prototype into components and submitting "final" variants w/ specification work.

@jsuereth jsuereth force-pushed the wip-resource-updates branch from 4d81df2 to b7421c2 Compare November 7, 2024 14:53

codecov bot commented Nov 7, 2024

Codecov Report

Attention: Patch coverage is 83.07087% with 43 lines in your changes missing coverage. Please review.

Project coverage is 90.03%. Comparing base (7829f53) to head (cdd5f79).
Report is 14 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...o/opentelemetry/sdk/resources/ResourceBuilder.java | 55.88% | 14 Missing and 1 partial ⚠️ |
| ...k/resources/detectors/ServiceInstanceDetector.java | 0.00% | 12 Missing ⚠️ |
| .../java/io/opentelemetry/sdk/resources/Resource.java | 90.24% | 4 Missing and 4 partials ⚠️ |
| ...lemetry/sdk/resources/ResourceProviderBuilder.java | 68.75% | 4 Missing and 1 partial ⚠️ |
| ...rter/internal/otlp/ResourceEntityRefMarshaler.java | 91.17% | 0 Missing and 3 partials ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #6855      +/-   ##
============================================
- Coverage     90.23%   90.03%   -0.21%     
- Complexity     6594     6669      +75     
============================================
  Files           729      740      +11     
  Lines         19800    20074     +274     
  Branches       1947     1983      +36     
============================================
+ Hits          17867    18074     +207     
- Misses         1341     1397      +56     
- Partials        592      603      +11     


Member

@jack-berg jack-berg left a comment

Variety of comments to get the conversation started. I think there is still a fair bit to decide in terms of which artifacts these new components live in and to what extent we need to extend the public API, but let's keep the conversation going.

static final Entity create(
String entityType,
Attributes identifying,
Attributes descriptive,
Member

If we call these descriptive here, should the getAttributes() method be renamed to getDescriptiveAttributes()?

Contributor Author

We discussed this a bit in the Entities WG. I think yes.

Shouldn't there be an option somewhere to strip all the descriptive attributes if we send entity events anyway?

Contributor Author

> Shouldn't there be an option somewhere to strip all the descriptive attributes if we send entity events anyway?

That'd be follow-on work in ResourceProvider.

/**
* Modify the descriptive attributes of this Entity.
*
* @param f A thunk which manipulates descriptive attributes.
Member

What's a thunk?

Separately, what's the motivation behind a consumer rather than the simpler withDescriptive(Attributes attributes)? The consumer pattern is more clever and lower overhead, but is less intuitive, and this probably isn't a place where efficiency matters.

Contributor Author

This was just me taking shortcuts in adding all the various methods. I think if we submit this we should expand to ALL relevant methods of add{Identifying|Descriptive}Attribute{s?}.

A "thunk" is a term in other languages for first-class functions, lambdas, etc. I can update this API to actually be the one we'd expose. For now, just assume you have a bunch of reasonable methods to add attributes :).
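
For illustration, the two API shapes discussed in this thread could look roughly like the sketch below. It is not the prototype's code: `SketchEntityBuilder`, `modifyDescriptive`, and `buildDescriptive` are hypothetical names, and a plain `Map<String, String>` stands in for the SDK's `Attributes` type.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-in for the prototype's EntityBuilder; a Map replaces Attributes.
final class SketchEntityBuilder {
  private final Map<String, String> descriptive = new LinkedHashMap<>();

  // Consumer style: the caller receives the mutable attribute map and edits it in place.
  SketchEntityBuilder modifyDescriptive(Consumer<Map<String, String>> f) {
    f.accept(descriptive);
    return this;
  }

  // Direct style: the caller passes attributes, which are copied into the builder.
  SketchEntityBuilder withDescriptive(Map<String, String> attributes) {
    descriptive.putAll(attributes);
    return this;
  }

  // Returns a defensive copy of the descriptive attributes collected so far.
  Map<String, String> buildDescriptive() {
    return new LinkedHashMap<>(descriptive);
  }
}
```

Both calls end with the same attributes; the direct style is arguably easier to read at the call site, while the consumer style avoids building an intermediate collection.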

* the SDK (called "associated entities"). For Example, if the SDK is running in a kubernetes pod,
* it may provide an Entity for that pod.
*/
public interface EntityDetector {
Member

What's the thinking behind making EntityDetector a part of the SDK, instead of part of opentelemetry-sdk-extension-autoconfigure-spi like ResourceProvider?

Contributor Author

Long term direction is for ResourceProvider to become part of SDK and allow mutating the set of entities or descriptive attributes over time.

I.e. ResourceProvider might provide something like an observer-pattern for gaining access to the current definition of resource. You can then pass this to {Signal}Provider.

Initially we don't need that and I expect ResourceProvider to be an internal detail.
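
A minimal sketch of what such an observer-style provider might look like, assuming listeners simply receive each new snapshot. `ObservableResourceSketch`, `onChange`, and `update` are hypothetical names, and a `Map<String, String>` stands in for `Resource`; nothing here is taken from the prototype.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical observable holder; not the prototype's ResourceProvider.
final class ObservableResourceSketch {
  private final List<Consumer<Map<String, String>>> listeners = new ArrayList<>();
  private Map<String, String> current = Collections.emptyMap();

  // Registers a listener and immediately hands it the current snapshot.
  synchronized void onChange(Consumer<Map<String, String>> listener) {
    listeners.add(listener);
    listener.accept(current);
  }

  // Publishes a new immutable snapshot to every registered listener.
  synchronized void update(Map<String, String> attributes) {
    current = Collections.unmodifiableMap(new LinkedHashMap<>(attributes));
    for (Consumer<Map<String, String>> listener : listeners) {
      listener.accept(current);
    }
  }

  // Returns the latest snapshot.
  synchronized Map<String, String> current() {
    return current;
  }
}
```

A signal provider could register once and react whenever the set of entities or descriptive attributes changes, while each snapshot itself stays immutable.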

*
* @return a list of discovered entities.
*/
List<Entity> detectEntities();
Member

Should a single entity detector return multiple entities or a single entity?

Contributor Author

Multiple, as is the case today for most non-otel detectors.

E.g. k8s entity detector may detect all relevant entities (k8s.cluster, k8s.namespace, k8s.pod, etc.). AWS/GCP/Azure/Other Cloud detector might find cloud entity in addition to host, faas, etc.
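
As a rough sketch of a detector that returns several entities at once: the types below are hypothetical stand-ins rather than the prototype's `Entity` and `EntityDetector`, the `SKETCH_*` environment variable names are invented for this example, and the identifying attribute keys are chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical minimal entity: a type plus a single identifying attribute.
final class DetectedEntitySketch {
  final String type;
  final String identifyingKey;
  final String identifyingValue;

  DetectedEntitySketch(String type, String identifyingKey, String identifyingValue) {
    this.type = type;
    this.identifyingKey = identifyingKey;
    this.identifyingValue = identifyingValue;
  }
}

// Mirrors the shape of the detectEntities() method quoted above.
interface EntityDetectorSketch {
  List<DetectedEntitySketch> detectEntities();
}

// Hypothetical detector; the SKETCH_* variable names are invented for this example.
final class K8sDetectorSketch implements EntityDetectorSketch {
  private final Map<String, String> env;

  K8sDetectorSketch(Map<String, String> env) {
    this.env = env;
  }

  @Override
  public List<DetectedEntitySketch> detectEntities() {
    List<DetectedEntitySketch> found = new ArrayList<>();
    addIfPresent(found, "k8s.cluster", "k8s.cluster.name", "SKETCH_K8S_CLUSTER");
    addIfPresent(found, "k8s.namespace", "k8s.namespace.name", "SKETCH_K8S_NAMESPACE");
    addIfPresent(found, "k8s.pod", "k8s.pod.name", "SKETCH_K8S_POD");
    return found;
  }

  // Adds an entity only when the corresponding variable is set and non-empty.
  private void addIfPresent(
      List<DetectedEntitySketch> out, String type, String key, String envVar) {
    String value = env.get(envVar);
    if (value != null && !value.isEmpty()) {
      out.add(new DetectedEntitySketch(type, key, value));
    }
  }
}
```

Passing the environment in as a map (for example `System.getenv()`) keeps the sketch testable; one detector call yields however many entities it could identify.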

package io.opentelemetry.sdk.resources;

/** A Registry which provides the {@link Resource} to the SDK. */
public final class ResourceProvider {
Member

We already have ResourceProvider and will likely need to pick a different name.

Separately, I'm a little confused on the purpose of this thing. Seems like we could get away with extending ResourceBuilder to include entity concepts.

Contributor Author

It's called out in the OTEP: https://github.com/open-telemetry/opentelemetry-specification/blob/main/oteps/entities/0264-resource-and-entities.md#resource-provider

Initially it doesn't do much; eventually it'll be critical for client-side handling (e.g. Android SDK).

// }

private static String getServiceName() {
return System.getenv().getOrDefault("OTEL_SERVICE_NAME", "unknown_service:java");
Member

We separate out the interpretation of environment variables into autoconfigure, which has proved to be a good separation of responsibilities. We should put all built-in entity detectors over there.

Contributor Author

@jsuereth jsuereth Nov 27, 2024

You need to give me a solution to this then: https://github.com/open-telemetry/opentelemetry-java/blob/main/sdk/common/src/main/java/io/opentelemetry/sdk/resources/Resource.java#L51. Effectively your "required" entity will be service from the SDK. I can have the baked-in detector discover a broken one, but then we need to make sure it appropriately gets overridden in all scenarios.

Additionally, be ready for: https://github.com/open-telemetry/opentelemetry-specification/blob/main/oteps/entities/0264-resource-and-entities.md#environment-variable-detector. This is explicitly not a configuration-based ENV variable. This is a "carrier" of context for environments (e.g. K8S, FAAS, etc.) to provide identity to things they run. I think we should probably talk about how best to encode this in Java. I'd rather not have this rely on autoconfigure, as that implies it's a configuration thing.


public final class ServiceInstanceDetector implements EntityDetector {
private static final String SCHEMA_URL = "https://opentelemetry.io/schemas/1.28.0";
private static final String ENTITY_TYPE = "service.instance";
Member

Are these entity types and their associated identifying / descriptive attributes codified anywhere yet?

Contributor Author

This does NOT update Resource.merge for entities yet.
This has many inefficiencies that can be optimised away.

- Adds Entity + EntityBuilder for constructing entities in the SDK
- Add EntityDetector for discovering entities
- Moves `service` and `telemetry.sdk` detection to EntityDetectors
- Updates to semconv 1.28.0 for service/telemetry.sdk
- Updates Resource + ResourceBuilder to preserve Entity
- Creates a ResourceProvider that can be used to construct Resource
  using entity merge rules.
- Hacky `getAttributes` update on Resource to preserve existing
  behavior.

Tests pending.
@jsuereth jsuereth force-pushed the wip-resource-updates branch from 144e141 to 8b4f48f Compare November 27, 2024 16:07
public abstract Attributes getIdentifyingAttributes();

/**
* Returns a map of attributes that describe the entity.

Is this all attributes or only the descriptive ones? If it is just descriptive, wouldn't we expect the method name to be getDescriptiveAttributes? Also, the comment should explain what the terms mean.

Contributor Author

It's descriptive. We discussed this two weeks ago, and yes this will get renamed.

As for javadoc - note: this is a prototype. The documentation isn't official, and we will need to flesh out more detail when officially adding this to the Java SDK. Additionally, this would not be adopted as-is because we'd need to use a mechanism where experimental components do not show up in stable portions of the SDK. This is meant to be a vision of the long-term stable state.

* say that the Process entity is related to the Host entity.
*/
@Immutable
@AutoValue

are we sure that the generated equals is really correct here?

Contributor Author

I can update equals to ignore descriptive attributes, but for the purpose of how this is used, it's fine. Specifically, we need to understand if A != B but A's identity == B's identity when doing merge logic.
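
To make the distinction concrete, here is a small sketch of full equality versus an identity-only comparison. `IdentityEntitySketch` and `hasSameIdentity` are hypothetical names, maps stand in for `Attributes`, and the hand-written `equals` only approximates what a generated all-fields `equals` would do.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical entity used only to contrast equality with identity matching.
final class IdentityEntitySketch {
  final String type;
  final Map<String, String> identifying;
  final Map<String, String> descriptive;

  IdentityEntitySketch(
      String type, Map<String, String> identifying, Map<String, String> descriptive) {
    this.type = type;
    this.identifying = identifying;
    this.descriptive = descriptive;
  }

  // Full equality: every field participates, descriptive attributes included.
  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof IdentityEntitySketch)) {
      return false;
    }
    IdentityEntitySketch other = (IdentityEntitySketch) o;
    return type.equals(other.type)
        && identifying.equals(other.identifying)
        && descriptive.equals(other.descriptive);
  }

  @Override
  public int hashCode() {
    return Objects.hash(type, identifying, descriptive);
  }

  // Identity check for merge logic: descriptive attributes are ignored.
  boolean hasSameIdentity(IdentityEntitySketch other) {
    return type.equals(other.type) && identifying.equals(other.identifying);
  }
}
```

With this split, merge logic can detect the case described above: two entities that are not equal but refer to the same identity.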

maybe move to @apiNote

*
* @return the entity type.
*/
public abstract String getType();

Why do we have type and schema URL? There is also no context link in the SDK, which looks wrong; I would have expected that one entity giving context to another implies having such links in at least one direction in memory.

Contributor Author

The context is implicit in that they're both on the same resource.

We haven't officially created the relationship aspect of the entity signal, so that's not here yet, nor is it clear where that will show up.

public Attributes getAttributes() {
// TODO - cache this.
AttributesBuilder result = Attributes.builder();
getEntities()

for (var e : getEntities()) is shorter and produces much better code

public abstract Attributes getAttributes();
// TODO - making this final breaks binary compatibility checks.
public Attributes getAttributes() {
// TODO - cache this.

can this be cached at all?

Contributor Author

Resource is immutable. We can cache (or generate on construction) the resulting attribute list.
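
A minimal sketch of the "generate on construction" option, assuming the flattened view is built once with a plain loop. `CachedResourceSketch` is a hypothetical type, maps stand in for entities and `Attributes`, and the "later entries win" behavior is a simplification, not the OTEP merge rule.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical immutable resource that computes its flattened attributes once.
final class CachedResourceSketch {
  private final List<Map<String, String>> entityAttributes;
  private final Map<String, String> merged;

  CachedResourceSketch(List<Map<String, String>> entityAttributes) {
    this.entityAttributes = Collections.unmodifiableList(new ArrayList<>(entityAttributes));
    Map<String, String> result = new LinkedHashMap<>();
    // Plain loop over the entities; later entries overwrite earlier keys (a simplification).
    for (Map<String, String> attrs : this.entityAttributes) {
      result.putAll(attrs);
    }
    this.merged = Collections.unmodifiableMap(result);
  }

  // Returns the view computed at construction; no work is repeated per call.
  Map<String, String> getAttributes() {
    return merged;
  }
}
```

This only holds while the resource itself is immutable; if descriptive attributes are later allowed to change, the cached view would need to be rebuilt or replaced by a new resource instance.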

This sounds wrong because we try to represent an underlying reality with changing descriptive attributes, right? Therefore, I would expect the entity pointers to be constant but the resulting attribute set to be subject to change.

*
* @return a collection of entities.
*/
public abstract Collection<Entity> getEntities();

LinkedHashSet?

private static final AttributeKey<String> SERVICE_INSTANCE_ID =
AttributeKey.stringKey("service.instance.id");

private static final UUID FALLBACK_INSTANCE_ID = UUID.randomUUID();

This could be an issue if multiple sources do it. Has someone looked into how agents interact with this concept? Since someone from Dynatrace was in the call, we should maybe ask them to have a look at how otel and the Dynatrace agent will pick up default identifying attributes issued by the sdk.

Contributor Author

Can you clarify what you mean here?

service.instance.id is something we've had a lot of debates on across OpenTelemetry. This UUID behavior was the latest agreed-to (but not yet implemented) semantic convention for it.

I do not care about it being a UUID or something different. The fallback part is what worries me. To me, the code looks as if the SDK would issue instance IDs eventually. This is an issue because a) surrounding agents could also issue instance IDs b) the infrastructure could be set up to issue instance IDs later in some OTel collectors enriching the data.

import java.util.List;

/** Detects the `service` entity. */
public final class ServiceDetector implements EntityDetector {

All the other comments are just comments, this one is a real concern: At the moment an important equivalence relation on telemetry data is based on (name, namespace, instance.id). Hence, I would have assumed that these three fields would result in one "service" entity with version as descriptive attribute. @jsuereth could you point me to a document or discussion explaining the choice you made here?

Contributor Author

So two things:

  1. namespace + instance.id (and version) are not stable in semantic conventions yet. This is highly problematic and something we discussed in Entities SIG earlier. We need to address this, but you need to deal with that (short term) issue now. This code only leverages the stable semantic conventions.
  2. The plan was to have (name/namespace) uniquely define a "Service" with version as descriptive. "ServiceInstance" would be related to Service in that a Service is composed of ServiceInstance. This is a proposal I have for how to model otel's service within Entities. I.e. a service is a grouping of service.instances.
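
The proposal in point 2 could be pictured with the sketch below, in which (name, namespace) identify a service, version is descriptive, and an instance refers to the service it belongs to. `ServiceSketch` and `ServiceInstanceSketch` are hypothetical types invented for this illustration and are not part of the prototype.

```java
// Hypothetical model of a "service": identified by (name, namespace), described by version.
final class ServiceSketch {
  final String name; // identifying
  final String namespace; // identifying
  final String version; // descriptive

  ServiceSketch(String name, String namespace, String version) {
    this.name = name;
    this.namespace = namespace;
    this.version = version;
  }

  // Two services share an identity when name and namespace match; version is ignored.
  boolean hasSameIdentity(ServiceSketch other) {
    return name.equals(other.name) && namespace.equals(other.namespace);
  }
}

// Hypothetical model of a "service.instance": one member of the group that forms a service.
final class ServiceInstanceSketch {
  final ServiceSketch service;
  final String instanceId; // identifying

  ServiceInstanceSketch(ServiceSketch service, String instanceId) {
    this.service = service;
    this.instanceId = instanceId;
  }
}
```

Under this shape, instances running different versions would still group under one service identity, which is the point the following reply takes issue with.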

  1. Reality is that people read only comment 2 in https://opentelemetry.io/docs/specs/semconv/resource/#service It does not really matter if some people try to restrict it any further. It has been like that since the very beginning and we should accept that it cannot be changed without breaking pretty much everything. While it is somewhat sad that this part is full of missed opportunities, those opportunities stay missed unless there is some breaking OpenTelemetry V2.
  2. Instances have versions, services do not. If you run a service, you can have multiple instances with different versions in an environment. I do not see what the value of a service entity would be if the service instance entity would provide all the value and would have to copy all the identifying attributes from the service entity.

3 participants