Composite policy doesn't allow for overflow between subpolicies if a policy sets Sampled but has gone over its max rate #36959
Labels
bug
needs triage
processor/tailsampling
Component(s)
processor/tailsampling
What happened?
Description
While the composite policy is processing its subpolicies, the following may happen: a subpolicy returns Sampled for a trace, but sampling that trace would take the subpolicy over its maximum rate. The composite policy then sets the decision to NotSampled and exits the function, thereby never checking any of the other subpolicies.
My personal opinion is that this is a bug, but I can see how this might be construed as a feature.
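A minimal sketch of the behavior described above, in Go. This is a simplified model and not the processor's actual code: the `subPolicy` type, the `evaluateComposite` function, and the `overflow` flag are hypothetical names introduced for illustration, the global `maxTotalSPS` check is left out, and the second subpolicy is an always-sample stand-in so that the output is deterministic. It models a single one-second window with one span per trace.

```go
package main

import "fmt"

type decision int

const (
	notSampled decision = iota
	sampled
)

// subPolicy is a hypothetical stand-in for one composite subpolicy.
type subPolicy struct {
	name         string
	evaluate     func(spanCount int64) decision
	allocatedSPS int64 // spans per second allocated to this subpolicy
	sampledSPS   int64 // spans already sampled in the current second
}

// evaluateComposite walks the subpolicies in order for one trace.
// With overflow=false it mirrors the reported behavior: the first subpolicy
// that says "sampled" but is over its rate ends the evaluation.
// With overflow=true the trace is offered to the next subpolicy instead.
func evaluateComposite(subs []*subPolicy, spanCount int64, overflow bool) (decision, string) {
	for _, sub := range subs {
		if sub.evaluate(spanCount) != sampled {
			continue
		}
		ifSampled := sub.sampledSPS + spanCount
		if ifSampled <= sub.allocatedSPS {
			sub.sampledSPS = ifSampled
			return sampled, sub.name
		}
		if !overflow {
			// Reported behavior: exit without consulting later subpolicies.
			return notSampled, sub.name
		}
	}
	return notSampled, ""
}

// run feeds 150 single-span traces through two subpolicies, each limited
// to 100 spans in the window, and counts which subpolicy sampled each trace.
func run(overflow bool) map[string]int {
	always := func(int64) decision { return sampled }
	subs := []*subPolicy{
		{name: "always", evaluate: always, allocatedSPS: 100},
		{name: "fallback", evaluate: always, allocatedSPS: 100},
	}
	counts := map[string]int{}
	for i := 0; i < 150; i++ {
		if d, name := evaluateComposite(subs, 1, overflow); d == sampled {
			counts[name]++
		}
	}
	return counts
}

func main() {
	fmt.Println(run(false)) // map[always:100]
	fmt.Println(run(true))  // map[always:100 fallback:50]
}
```

Without overflow, the 50 traces that arrive after the first subpolicy is saturated are dropped even though the second subpolicy still has capacity. With overflow, they are picked up by the second subpolicy.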
Steps to Reproduce
Expected Result
The always sample subpolicy (ASSP) receives the traces first, and any unsampled traces then move on to the probabilistic subpolicy (PSP).
If the incoming rate of spans/sec is ramped up, we should at first see a 100% sampling rate for this composite policy, followed by an inflection point where the ASSP saturates at 100 spans/sec and 80% of the spans above this point are not sampled. Similarly, the exporter's spans/sec rate should show an inflection at around the same time, where the slope of the graph after the inflection is 20% of the slope before.
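The expected curve can be written as a small piecewise function. The sketch below is illustrative only: `expectedSampledRate` is a hypothetical helper, the 100 spans/sec limit and the 20% probability are the figures quoted above, and it assumes the PSP has no rate limit of its own and that the probability applies uniformly to spans.

```go
package main

import "fmt"

// expectedSampledRate returns the spans/sec this issue expects to be sampled
// when the always-sample subpolicy takes the first alwaysLimit spans/sec and
// the remainder overflows to a probabilistic subpolicy.
func expectedSampledRate(incoming, alwaysLimit, probability float64) float64 {
	if incoming <= alwaysLimit {
		// Below the limit everything is sampled (slope 1).
		return incoming
	}
	// Above the limit only the overflow is sampled probabilistically
	// (slope equal to probability).
	return alwaysLimit + probability*(incoming-alwaysLimit)
}

func main() {
	for _, in := range []float64{50, 100, 200, 600} {
		out := expectedSampledRate(in, 100, 0.2)
		fmt.Printf("%.0f spans/sec in -> %.0f spans/sec out\n", in, out)
	}
}
```

For inputs of 50, 100, 200, and 600 spans/sec this gives 50, 100, 120, and 200 spans/sec, so the slope drops from 1 to 0.2 at the 100 spans/sec inflection point.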
Actual Result
ASSP always receives all traces and PSP never receives any traces. The sampled spans per second flatlines at the maximum number of spans per second allocated to ASSP.
Collector version
v0.116.0
Environment information
Environment
OS: Ubuntu 22.04
Compiler(if manually compiled): [email protected] go1.23.4
OpenTelemetry Collector configuration
Log output
spansInSecondIfSampled <= c.maxTotalSPS