feat: produce OpenTelemetry traces with hs-opentelemetry
#3140
base: main
Conversation
Awesome work! 🔥 🔥
Found this Nix flake that contains an OTel GUI: https://flakestry.dev/flake/github/FriendsOfOpenTelemetry/opentelemetry-nix/1.0.1. I'll try to integrate it once the PR is ready for review.
(force-pushed from 8c0e16a to 64a0ee9)
The recent problem I'm seemingly stuck with is … There's a more straightforward … It also seems to boil down to the conceptual choice between online and offline trace delivery, i.e. a push vs. pull model. @steve-chavez @wolfgangwalther @laurenceisla what do you think, guys?
@develop7 Would vault help? It was introduced in #1988; I recall it helped with IORef handling. It's still used in postgrest/src/PostgREST/Auth.hs (lines 160 to 165 in d2fb67f).
I'm still not that familiar with OTel, but the basic idea I had was to store these traces on AppState and export them asynchronously.
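For context, a minimal sketch of the vault pattern being referenced: wai lets a middleware stash a per-request value in the Request's Vault and read it back further down the stack. TraceCtx and mkMiddleware are hypothetical names for illustration, not PostgREST's actual Auth.hs code.

```haskell
module VaultSketch where

import qualified Data.Vault.Lazy as Vault
import           Network.Wai (Middleware, Request, vault)

-- Hypothetical per-request payload; Auth.hs stores auth results this way.
newtype TraceCtx = TraceCtx { traceId :: String }

-- Returns a middleware that inserts a TraceCtx into each request's vault,
-- plus a reader to fetch it back anywhere the Request is available.
mkMiddleware :: IO (Middleware, Request -> Maybe TraceCtx)
mkMiddleware = do
  key <- Vault.newKey
  let mw app req respond =
        let req' = req { vault = Vault.insert key (TraceCtx "example") (vault req) }
        in  app req' respond
  pure (mw, Vault.lookup key . vault)
```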
(force-pushed from 6b891c2 to 586e7a1)
Not only that; you want traces in tests too, for one. The good news is …
Good call @steve-chavez, thank you for the suggestion. Will try it too.
(force-pushed from 0830a45 to dc882f1)
Since we now have an … (postgrest/src/PostgREST/App.hs lines 170 to 172 and postgrest/src/PostgREST/Observation.hs lines 15 to 18 in 229bc77), perhaps we can add some observations for the timings?
Also, the Logger is now used like this (postgrest/src/PostgREST/Logger.hs lines 53 to 54 and postgrest/src/PostgREST/CLI.hs line 50 in 7c6c056).
For OTel, maybe the following would make sense:
  otelState <- Otel.init
  App.run appState (Logger.logObservation loggerState >> OTel.tracer otelState)
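As written, the two handlers can't literally be chained with >>, since each one takes the observation as an argument; a tiny combinator makes the intent concrete. This is an illustrative sketch: Observation is a stand-in type, and combineHandlers and the handler names are not PostgREST's actual API.

```haskell
-- Sketch: run both the logger and the OTel exporter for every observation.
type ObservationHandler = Observation -> IO ()

data Observation = ExampleObservation  -- stand-in for the real sum type

combineHandlers :: ObservationHandler -> ObservationHandler -> ObservationHandler
combineHandlers h1 h2 obs = h1 obs >> h2 obs

-- Mirroring the pseudocode above:
--   App.run appState
--     (combineHandlers (Logger.logObservation loggerState)
--                      (OTel.observation otelState))
```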
(force-pushed from dc882f1 to 7794848)
Agreed, server timings definitely belong there.
(force-pushed from 7794848 to 398206b)
(force-pushed from 398206b to 4cd99c6)
Okay, the PR has been cooking for long enough; let's pull the plug and start small. Let's have it reviewed while I fix the remaining CI failures.
(force-pushed from 4cd99c6 to 94d2b9b)
I don't think we depend on this in the current state. And we should certainly not depend on an even less maintained fork of the same. So to go forward here, some effort needs to be put into the upstream package first, to make it usable for us.
(force-pushed from 590d142 to e809a65)
A status update: …
Hm. I looked at your fork. It depends on support for GHC 9.8 in … I guess GHC 9.8 support is just a matter of time. What about the other issues mentioned above? Were you able to make progress on those?
In my prototype I actually played with replacing the Hasql Session with an https://github.com/haskell-effectful/effectful based monad to make it extensible: https://github.com/mkleczek/hasql-api/blob/master/src/Hasql/Api/Eff/Session.hs#L37
Using it in PostgREST required some mixins usage in Cabal: 29b946e#diff-eb6a76805a0bd3204e7abf68dcceb024912d0200dee7e4e9b9bce3040153f1e1R140
Some work was required in PostgREST startup/configuration code to set up the appropriate effect handlers and middlewares, but the changes were quite well isolated. At the end of the day, I think basing your monad stack on an effect library (effectful, cleff, etc.) is the way forward, as it makes the solution highly extensible and configurable.
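For readers unfamiliar with effectful, here is a minimal sketch of the kind of effect being described; the Sql effect and its handler are illustrative, not hasql-api's actual types. The extensibility argument is that a tracing interpretation (e.g. one wrapping each query in an OTel span) is just another handler.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
module SqlEffect where

import Effectful (Dispatch (Dynamic), DispatchOf, Eff, Effect, (:>))
import Effectful.Dispatch.Dynamic (interpret, send)

-- A tiny SQL effect: run a query, get rows back as strings.
data Sql :: Effect where
  RunQuery :: String -> Sql m [String]

type instance DispatchOf Sql = Dynamic

runQuery :: Sql :> es => String -> Eff es [String]
runQuery q = send (RunQuery q)

-- One possible interpretation; a real one would call into hasql.
runSqlPure :: Eff (Sql : es) a -> Eff es a
runSqlPure = interpret $ \_ (RunQuery q) -> pure ["result of: " <> q]
```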
(force-pushed from e809a65 to 4697009)
(force-pushed from 650d008 to ac33872)
Update: rebased the PR against the latest main.
Building …
I still wonder if we could do this in a less invasive way 🤔. We added Prometheus metrics in a simple way by using our Observation module (postgrest/src/PostgREST/AppState.hs line 133 in 2564b32).
Ideally we would add an Otel module, add an … I'm not sure if this can be done entirely with … It seems better to maintain some of this functionality in-tree, especially since the dependency is not fully featured yet. That would also give us better control.
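A sketch of what such an in-tree Otel module could look like, following the initializeGlobalTracerProvider/makeTracer/inSpan API shown in hs-opentelemetry-sdk's README; the Observation constructors and function names here are stand-ins, not PostgREST's real ones, and emitting an empty span per observation is a simplification (a real handler would wrap the observed work).

```haskell
{-# LANGUAGE OverloadedStrings #-}
module OTelSketch where

import Data.Text (Text)
import OpenTelemetry.Trace
  ( Tracer, defaultSpanArguments, inSpan
  , initializeGlobalTracerProvider, makeTracer, tracerOptions )

data Observation = ConnectionRetried | SchemaCacheLoaded  -- stand-in

spanName :: Observation -> Text
spanName ConnectionRetried = "connection-retry"
spanName SchemaCacheLoaded = "schema-cache-load"

-- Initialize once at startup; configuration comes from OTEL_* env vars.
initOtel :: IO Tracer
initOtel = do
  provider <- initializeGlobalTracerProvider
  pure (makeTracer provider "postgrest" tracerOptions)

-- The handler that would be composed with Logger.logObservation.
observation :: Tracer -> Observation -> IO ()
observation tracer obs =
  inSpan tracer (spanName obs) defaultSpanArguments (pure ())
```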
@steve-chavez this could be even more preferable, since every …
It seems so; I haven't sent any of these traces to Honeycomb myself. I might only need to fiddle with the call stack so it won't have …
Nix builds still fail because the nixpkgs we're using doesn't have the updated …
(force-pushed from 2c58608 to d821d86)
Oh, and all the checks are officially green.
Enough is enough; let's have the feedback.
There is a certain slowdown noticeable in the loadtest reports. I guess that's expected, but can we do any better?
I didn't look at the code at all, I'll leave that to @steve-chavez.
(force-pushed from dc4e9a7 to 6afc7cd)
(force-pushed from 6afc7cd to c065a27)
There is, and it's noticeable indeed, and I didn't particularly expect it: I'm assuming unconfigured OTel should be a no-op, or close to it. Let me check their defaults.
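One way to guarantee the no-op behaviour, sketched under the assumption that the standard OTEL_TRACES_EXPORTER environment variable is the switch (the OpenTelemetry spec defines "none" as disabling export); initOtel and observation refer to the hypothetical names from the sketch above.

```haskell
import System.Environment (lookupEnv)

-- Only pay for tracing when an exporter is actually configured;
-- otherwise hand back a handler that does nothing on the fast path.
setupTracing :: IO (Observation -> IO ())
setupTracing = do
  exporter <- lookupEnv "OTEL_TRACES_EXPORTER"
  case exporter of
    Just name | name /= "none" -> do
      tracer <- initOtel            -- from the sketch above
      pure (observation tracer)
    _ -> pure (\_ -> pure ())       -- unconfigured or "none": true no-op
```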
Update reference memory usage accordingly
@steve-chavez without fiddling with CallStack this would report incorrect code locations, just the way it does at the moment.
@wolfgangwalther the tracing was enabled by default, so I've disabled it for the memory tests, shaving off about a megabyte per test.
Hm, besides code maintenance, the way I suggested would also keep the tracing out of the fast path. As it is, we lose much perf (ref), which doesn't seem worth it.
The Observations could contain the original lines as an extra field, and I guess we could send these to the OTel server?
To be fair, OTel adds only 5% on top of another 10%.
That's one way to address it indeed; it's just that third-party tools would most likely be confused, since they expect a well-known attribute name.
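For reference, a sketch of mapping a captured GHC call site to the well-known code.* attribute names from the OpenTelemetry semantic conventions (code.function, code.namespace, code.filepath, code.lineno). Attaching the pairs to a span is left out, and rendering code.lineno as text is a simplification (the convention expects an integer).

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T
import GHC.Stack (CallStack, SrcLoc (..), getCallStack)

-- Turn the innermost call-stack frame into OTel code.* attribute pairs.
codeAttributes :: CallStack -> [(T.Text, T.Text)]
codeAttributes cs = case getCallStack cs of
  (fn, loc) : _ ->
    [ ("code.function",  T.pack fn)
    , ("code.namespace", T.pack (srcLocModule loc))
    , ("code.filepath",  T.pack (srcLocFile loc))
    , ("code.lineno",    T.pack (show (srcLocStartLine loc)))
    ]
  [] -> []
```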
IIRC, those were specific to the root endpoint requests - but since we don't split the loadtest data by endpoint, you can't easily tell that. So not critical. But the OTel stuff is adding it to every request, right?
@wolfgangwalther yes, it did by default. Disabling it in …
This PR introduces producing OpenTelemetry traces containing, among others, the same metrics as in the ServerTiming header from before.

TODO:
- build with Nix as well (for now Stack only): DONE for free
- make an example of exporting log messages: hs-opentelemetry doesn't support logging yet, per the logging roadmap (iand675/hs-opentelemetry#100)
- make getTracer available globally: we're interested in using as many different spans as makes sense, so getTracer should be available everywhere, as described in hs-opentelemetry-sdk's README
- use the hs-opentelemetry-wai middleware
- look into failing Windows builds: hs-opentelemetry-sdk depends on unix, tracked in iand675/hs-opentelemetry#109 (Windows support)

Running:
I sort of gave up deploying and configuring all the moving bits locally, so you'd need to create a honeycomb.io account for this one (or ask me for the invite). After that, it's quite straightforward:
- build with stack build, and get the binary's path with stack exec -- which postgrest
- or enter nix-shell, then … (postgrest-with-postgresql-15 -- cat …)
- … (postgrest_test_anonymous for the example DB from the above example)